-
Publication number: US20240362279A1
Publication date: 2024-10-31
Application number: US18306638
Application date: 2023-04-25
Applicant: Google LLC
Inventor: Harshit Kharbanda, Belinda Luna Zeng, Viviana Caso Corella, Christopher James Kelley, Jessica Lee, Pendar Yousefi, Dounia Berrada, Sundeep Vaddadi, Kai Yu, Balint Miklos, Severin Heiniger, Louis Wang
IPC: G06F16/9532, G06F16/538, G06F40/40
CPC classification number: G06F16/9532, G06F16/538, G06F40/40
Abstract: A multimodal search system is described. The system can receive image data captured by a camera of a user device. Additionally, the system can receive audio data associated with the image data. The audio data can be captured by a microphone of the user device. Moreover, the system can process the image data to generate visual features. Furthermore, the system can process the audio data to generate a plurality of words. The system can generate a plurality of search terms based on the plurality of words and the visual features. Subsequently, the system can determine one or more search results associated with the plurality of search terms and provide the one or more search results as an output.
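The term-generation step in this abstract can be sketched as follows. This is a minimal illustration, not the claimed implementation: the `VisualFeature` class, the stopword list, the `min_score` threshold, and the ordering of terms are all assumptions introduced here, and the image and speech models are replaced by hard-coded sample outputs.

```python
from dataclasses import dataclass


@dataclass
class VisualFeature:
    """Hypothetical stand-in for one visual feature produced by an image model."""
    label: str
    score: float


# Illustrative stopword list (an assumption); a real system would use a fuller one.
STOPWORDS = {"a", "an", "the", "is", "this", "what", "where", "can", "i", "of", "to"}


def build_search_terms(words, features, min_score=0.5):
    """Merge confident visual labels and spoken words into ordered, de-duplicated terms."""
    terms = []
    seen = set()
    # Visual labels first, highest score first, skipping low-confidence ones.
    for feature in sorted(features, key=lambda f: f.score, reverse=True):
        label = feature.label.lower()
        if feature.score >= min_score and label not in seen:
            seen.add(label)
            terms.append(label)
    # Then the content-bearing words from the transcribed audio.
    for word in words:
        token = word.lower().strip(".,?!")
        if token and token not in STOPWORDS and token not in seen:
            seen.add(token)
            terms.append(token)
    return terms


# Sample inputs standing in for the speech-to-text and image-model outputs.
words = "Where can I buy this?".split()
features = [
    VisualFeature("sneaker", 0.92),
    VisualFeature("red", 0.71),
    VisualFeature("table", 0.20),
]
print(build_search_terms(words, features))  # ['sneaker', 'red', 'buy']
```

The resulting terms would then be passed to whatever search backend returns the results; that step is outside this sketch.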
-
Publication number: USD1048067S1
Publication date: 2024-10-22
Application number: US29866692
Application date: 2022-09-23
Applicant: Google LLC
Designer: Christopher Kelley, Minsang Choi, Pritam Singh Pebam, Caroline Chilton, Carrie Linda Bisazza, Matthew Roth, Sabrina Curry, Natalie Michele Salaets, Jongwon Yu, Belinda Zeng, Harshit Kharbanda, Louis Wang, Austin Wu, Nishant Ranka, Morgane Magali Laure Sanglier
Abstract: The sole FIGURE is a front view of a display screen or portion thereof with graphical user interface showing the claimed design.
The outermost evenly spaced broken lines in the drawings show the electronic device, which is the environment of the design and forms no part of the claimed design. The dot-dash broken lines showing the display screen or portion thereof forms no part of the claimed design. The remaining broken lines showing portions of the graphical user interface form no part of the claimed design.
-
Publication number: USD1048066S1
Publication date: 2024-10-22
Application number: US29866691
Application date: 2022-09-23
Applicant: Google LLC
Designer: Christopher Kelley, Minsang Choi, Pritam Singh Pebam, Caroline Chilton, Carrie Linda Bisazza, Matthew Roth, Sabrina Curry, Natalie Michele Salaets, Jongwon Yu, Belinda Zeng, Harshit Kharbanda, Louis Wang, Austin Wu, Nishant Ranka, Morgane Magali Laure Sanglier
Abstract: FIG. 1 is a front view of a display screen or portion thereof with transitional graphical user interface showing a first image of the claimed design.
FIG. 2 is a second image thereof; and,
FIG. 3 is a third image thereof.
The outermost evenly spaced broken lines in the drawings show the electronic device, which is the environment of the design and forms no part of the claimed design. The dot-dash broken lines showing the display screen or portion thereof forms no part of the claimed design. The remaining broken lines showing portions of the graphical user interface form no part of the claimed design.
The appearance of the transitional image sequentially transitions between the views of FIGS. 1-3. The process or period in which an image transitions to another forms no part of the claimed design.
-
Publication number: USD1048063S1
Publication date: 2024-10-22
Application number: US29866687
Application date: 2022-09-23
Applicant: Google LLC
Designer: Christopher Kelley, Minsang Choi, Pritam Singh Pebam, Caroline Chilton, Carrie Linda Bisazza, Matthew Roth, Sabrina Curry, Natalie Michele Salaets, Jongwon Yu, Belinda Zeng, Harshit Kharbanda, Louis Wang, Austin Wu, Nishant Ranka, Morgane Magali Laure Sanglier
Abstract: FIG. 1 is a front view of a first embodiment of a display screen or portion thereof with transitional graphical user interface showing a first image of the claimed design.
FIG. 2 is a second image thereof;
FIG. 3 is a third image thereof;
FIG. 4 is a front view of a second embodiment of a display screen or portion thereof with transitional graphical user interface showing a first image of the claimed design.
FIG. 5 is a second image thereof; and,
FIG. 6 is a third image thereof.
The outermost evenly spaced broken lines in the drawings show the electronic device, which is the environment of the design and forms no part of the claimed design. The dot-dash broken lines showing the display screen or portion thereof forms no part of the claimed design. The remaining broken lines showing portions of the graphical user interface form no part of the claimed design.
The appearance of the transitional image sequentially transitions between the views of FIGS. 1-3 and FIGS. 4-6. The process or period in which an image transitions to another forms no part of the claimed design.
-
Publication number: US20230368527A1
Publication date: 2023-11-16
Application number: US18084710
Application date: 2022-12-20
Applicant: Google LLC
Inventor: Jessica Lee, Christopher James Kelley, Alok Aggarwal, Harshit Kharbanda
IPC: G06V20/20, G06T11/00, G06V10/94, G06F16/9535
CPC classification number: G06V20/20, G06T11/00, G06V10/945, G06F16/9535, G06T2200/24
Abstract: Systems and methods for providing scene understanding can include obtaining a plurality of images, stitching images associated with the scene, detecting objects in the scene, and providing information associated with the objects in the scene. The systems and methods can include determining filter tags or query tags that can be selected to filter the plurality of objects, which can then be provided as information to the user to provide further insight on the scene. The information may be provided in an augmented-reality experience via text or other user-interface elements anchored to objects in the images.
-
Publication number: US20230259993A1
Publication date: 2023-08-17
Application number: US18165084
Application date: 2023-02-06
Applicant: Google LLC
Inventor: Harshit Kharbanda, Christopher Kelley, Louis Wang
IPC: G06Q30/0282, G06F16/953, G06V20/20, G06F3/0482, G06F18/40, G06F18/2113, G06V20/00
CPC classification number: G06Q30/0282, G06F16/953, G06V20/20, G06F3/0482, G06F18/40, G06F18/2113, G06V20/00, G06V30/10
Abstract: In a general aspect, a method can include receiving, by an electronic device, a visual scene; identifying, by the electronic device, a plurality of elements of the visual scene; and determining, based on the plurality of elements identified in the visual scene, a context of the visual scene. The method can further include applying, based on the determined context of the visual scene, at least one filter to identify at least one element of the plurality of elements corresponding with the at least one filter; and visually indicating, in the visual scene on a display of the electronic device, the at least one element identified using the at least one filter.
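The context-then-filter flow in this abstract can be sketched as follows. This is an illustrative sketch under stated assumptions, not the patented method: the element dictionaries, the two lookup tables, and the majority-vote rule for picking a context are all hypothetical, and the visual highlighting step is reduced to returning the matching elements.

```python
from collections import Counter

# Hypothetical mapping from an element category to the scene context it suggests.
CONTEXT_BY_CATEGORY = {"food_product": "grocery_shelf", "book": "bookshelf"}

# Hypothetical mapping from a scene context to the filters offered for it.
FILTERS_BY_CONTEXT = {
    "grocery_shelf": ["gluten_free", "vegan"],
    "bookshelf": ["fiction", "highly_rated"],
}


def determine_context(elements):
    """Pick the context suggested by the most common known element category."""
    votes = Counter(
        CONTEXT_BY_CATEGORY[element["category"]]
        for element in elements
        if element["category"] in CONTEXT_BY_CATEGORY
    )
    return votes.most_common(1)[0][0] if votes else None


def apply_filter(elements, filter_name):
    """Return the elements that carry the selected filter's tag."""
    return [element for element in elements if filter_name in element["tags"]]


# Sample elements standing in for the output of an element-identification step.
elements = [
    {"name": "oat bar", "category": "food_product", "tags": {"vegan", "gluten_free"}},
    {"name": "wheat crackers", "category": "food_product", "tags": {"vegan"}},
    {"name": "price sign", "category": "signage", "tags": set()},
]

context = determine_context(elements)
available_filters = FILTERS_BY_CONTEXT.get(context, [])
highlighted = apply_filter(elements, "gluten_free")
print(context, available_filters, [element["name"] for element in highlighted])
```

In a real interface the `highlighted` elements would be marked on the display; here they are only printed.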
-
Publication number: US20250140006A1
Publication date: 2025-05-01
Application number: US18620136
Application date: 2024-03-28
Applicant: Google LLC
Inventor: Harshit Kharbanda, Boris Bluntschli, Vibhuti Mahajan, Louis Wang
IPC: G06V20/70, G06V10/764, G06V20/40
Abstract: Systems and methods for image understanding can include one or more object recognition systems and one or more vision language models to generate an augmented language output that can be both scene-aware and object-aware. The systems and methods can process an input image with an object recognition model to generate an object recognition output descriptive of identification details for an object depicted in the input image. The systems and methods can include processing the input image with a vision language model to generate a language output descriptive of a predicted scene description. The object recognition output can then be utilized to augment the language output to generate an augmented language output that includes the scene understanding of the language output with the specificity of the object recognition output.
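One simple way to picture the augmentation step is a text substitution, sketched below. This is only an assumption about how the two outputs could be combined; the abstract does not say the augmentation is done by string replacement. The `augment` function and the sample outputs are hypothetical stand-ins for the vision language model and the object recognition model.

```python
import re


def augment(language_output, recognitions):
    """Replace each generic term with the recognizer's specific label (whole words only)."""
    augmented = language_output
    for generic, specific in recognitions.items():
        pattern = rf"\b{re.escape(generic)}\b"
        # A function replacement avoids treating backslashes in the label as escapes.
        augmented = re.sub(pattern, lambda _match: specific, augmented, flags=re.IGNORECASE)
    return augmented


# Sample outputs standing in for the two models.
scene_description = "A dog sitting next to a car."
recognitions = {"dog": "golden retriever", "car": "1967 Ford Mustang"}

print(augment(scene_description, recognitions))
# A golden retriever sitting next to a 1967 Ford Mustang.
```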
-
Publication number: US20250087207A1
Publication date: 2025-03-13
Application number: US18736113
Application date: 2024-06-06
Applicant: Google LLC
Inventor: Harshit Kharbanda, Jessica Lee, Christopher James Kelley, Fabian Roth, Dounia Berrada, Samer Hassan Hassan, Afroz Mohiuddin, Misha Khalman, Ali Essam Ali Elqursh, Belinda Luna Zeng
IPC: G10L15/183, G06F16/583, G06V10/778, G06V30/14, G06V30/148, G10L15/22, G10L15/30
Abstract: The present disclosure provides computer-implemented methods, systems, and devices for responding to requests associated with an image. A computing system obtains an image, wherein the image depicts a first set of textual content. The computing system determines one or more characteristics of the first set of textual content. The computing system determines a response type from a plurality of response types based on the one or more characteristics. The computing system generates a model input, wherein the model input comprises data descriptive of the first set of textual content and a prompt associated with the response type. The computing system provides the model input as an input to a machine-learned language model. The computing system receives a second set of text as an output of the machine-learned language model as a result of the machine-learned language model processing the model input. The computing system provides the second set of text for display to a user, wherein the second set of text is associated with the response type.
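The selection of a response type and the construction of the model input can be sketched as follows. This is a minimal illustration under stated assumptions: the three response types, their prompts, the two text characteristics, and the thresholds are all hypothetical, since the abstract names none of them, and the call to the language model itself is left out.

```python
# Hypothetical response types and prompts; the abstract does not enumerate them.
RESPONSE_PROMPTS = {
    "translate": "Translate the following text into English:",
    "summarize": "Summarize the following text:",
    "explain": "Explain the following text:",
}


def characteristics(text):
    """Compute a few simple, illustrative characteristics of the extracted text."""
    return {"word_count": len(text.split()), "is_ascii": text.isascii()}


def choose_response_type(traits, long_text_words=50):
    """Map characteristics to one response type (assumed rules, for illustration)."""
    if not traits["is_ascii"]:
        return "translate"
    if traits["word_count"] > long_text_words:
        return "summarize"
    return "explain"


def build_model_input(text):
    """Return the chosen response type and the prompt-plus-text model input."""
    response_type = choose_response_type(characteristics(text))
    return response_type, f"{RESPONSE_PROMPTS[response_type]}\n\n{text}"


# Sample text standing in for content recognized in the image.
response_type, model_input = build_model_input("Solve 2x + 3 = 7")
print(response_type)
print(model_input)
```

The returned `model_input` is what would be handed to the language model; its output would then be displayed alongside the response type that produced it.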
-
Publication number: US20240378237A1
Publication date: 2024-11-14
Application number: US18314663
Application date: 2023-05-09
Applicant: Google LLC
Inventor: Harshit Kharbanda, Jessica Lee, Christopher James Kelley, Belinda Luna Zeng, Louis Wang
IPC: G06F16/583, G06V10/74
Abstract: Result images are retrieved based on a similarity to a query image. A set of textual inputs is processed with a machine-learned language model to obtain a language output comprising textual content, wherein the set of textual inputs comprises textual content from source documents that include the result images, and a prompt associated with the query image. The language output and the result images are provided to a user computing device. Information is received descriptive of an indication by a user that a first result image is visually dissimilar to the query image. Textual content associated with the source document that includes the first result image from the set of textual inputs is removed. The set of textual inputs is processed with the machine-learned language model to obtain a refined language output. The refined language output is provided to the user computing device.
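The feedback loop in this abstract can be sketched as follows. This is an illustrative sketch, not the claimed implementation: `run_language_model` is a hypothetical stub that only reports how many passages it received, and the document and image identifiers are made-up sample data.

```python
def run_language_model(prompt, passages):
    """Hypothetical stand-in for the machine-learned language model."""
    return f"{prompt} (based on {len(passages)} source passages)"


def refine_after_feedback(prompt, text_by_document, document_by_image, rejected_image):
    """Remove the rejected image's source text, then re-run the model on what remains."""
    rejected_document = document_by_image[rejected_image]
    remaining = {
        document: text
        for document, text in text_by_document.items()
        if document != rejected_document
    }
    return remaining, run_language_model(prompt, list(remaining.values()))


# Sample data: text from the source documents that contain each result image.
text_by_document = {
    "doc_a": "Monstera care guide.",
    "doc_b": "Philodendron watering tips.",
    "doc_c": "Monstera propagation notes.",
}
document_by_image = {"img_1": "doc_a", "img_2": "doc_b", "img_3": "doc_c"}
prompt = "Describe the plant in the query image."

first_output = run_language_model(prompt, list(text_by_document.values()))
# The user marks img_2 as visually dissimilar to the query image.
remaining, refined_output = refine_after_feedback(
    prompt, text_by_document, document_by_image, "img_2"
)
print(first_output)
print(refined_output)
```

The original `text_by_document` mapping is left unchanged, so the same function can be applied again if the user rejects another result image.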
-
Publication number: USD1048065S1
Publication date: 2024-10-22
Application number: US29866690
Application date: 2022-09-23
Applicant: Google LLC
Designer: Christopher Kelley, Minsang Choi, Pritam Singh Pebam, Caroline Chilton, Carrie Linda Bisazza, Matthew Roth, Sabrina Curry, Natalie Michele Salaets, Jongwon Yu, Belinda Zeng, Harshit Kharbanda, Louis Wang, Austin Wu, Nishant Ranka, Morgane Magali Laure Sanglier
Abstract: The sole FIGURE is a front view of a display screen or portion thereof with graphical user interface showing the claimed design.
The outermost evenly spaced broken lines in the drawings show the electronic device, which is the environment of the design and forms no part of the claimed design. The dot-dash broken lines showing the display screen or portion thereof forms no part of the claimed design. The remaining broken lines showing portions of the graphical user interface form no part of the claimed design.