Visual indicators of generative model response details

    Publication number: US12266065B1

    Publication date: 2025-04-01

    Application number: US18409268

    Filing date: 2024-01-10

    Applicant: Google LLC

    Abstract: Systems and methods for providing visual indications of generative model responses can include obtaining a user input and processing the user input with a generative model to generate a model-generated response. The systems and methods can process the model-generated response and an image of an environment to generate an augmented image. The augmented image can include visual indicators of the model-generated response, which can include annotating the image based on detected features within the image. Generation of the augmented image can include object detection and annotation based on the content of the model-generated response.
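
The annotation step in this abstract can be illustrated with a minimal sketch. It assumes object detection has already produced labeled bounding boxes and that a detection is relevant when its label appears in the response text. The `Detection` class, the `select_annotations` function, and the substring-matching rule are hypothetical and are not taken from the patent.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Detection:
    """One detected feature in the image (hypothetical structure)."""
    label: str
    box: Tuple[int, int, int, int]  # (x, y, width, height) in pixels


def select_annotations(response: str, detections: List[Detection]) -> List[Detection]:
    """Keep only detections whose label is mentioned in the response text."""
    text = response.lower()
    return [d for d in detections if d.label.lower() in text]


# Illustrative data only: two detections and one model-generated response.
detections = [
    Detection("valve", (40, 60, 30, 30)),
    Detection("gauge", (120, 20, 50, 50)),
]
to_draw = select_annotations("Turn the valve clockwise to shut it off.", detections)
```

The boxes in `to_draw` are the regions a renderer would highlight on the image. A real system would likely use a stronger matching rule than substring search.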

    Efficiently augmenting images with related content

    Publication number: US20220121331A1

    Publication date: 2022-04-21

    Application number: US17563695

    Filing date: 2021-12-28

    Applicant: Google LLC

    Abstract: The subject matter of this specification generally relates to providing content related to text depicted in images. In one aspect, a system includes a data processing apparatus configured to extract text from an image. The extracted text is partitioned into multiple blocks. The multiple blocks are presented as respective first user-selectable targets on a user interface at a first zoom level. A user selection of a first block of the multiple blocks is detected. In response to detecting the user selection of the first block, portions of the extracted text in the first block are presented as respective second user-selectable targets on the user interface at a second zoom level greater than the first zoom level. In response to detecting a user selection of a portion of the extracted text within the first block, an action is initiated based on content of the user-selected text.
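
The two-level selection flow in this abstract can be sketched as follows. The sketch assumes the extracted text is a plain string, that blocks are separated by blank lines, and that second-level targets are whitespace-separated tokens. The function names and the phone-number rule for choosing an action are hypothetical illustrations, not the method claimed in the application.

```python
import re
from typing import List


def partition_blocks(extracted_text: str) -> List[str]:
    """First-level targets: split the extracted text into blocks on blank lines."""
    return [b.strip() for b in extracted_text.split("\n\n") if b.strip()]


def second_level_targets(block: str) -> List[str]:
    """Second-level targets: the tokens of the selected block, shown at a higher zoom."""
    return block.split()


def choose_action(selected_text: str) -> str:
    """Pick an action from the content of the selected text (illustrative rule only)."""
    looks_like_phone = re.fullmatch(r"[\d\-\s()+]{7,}", selected_text) is not None
    return "call" if looks_like_phone else "search"


# Illustrative text standing in for the output of an OCR step.
text = "Pizza Palace\nOpen daily\n\nCall 555-0100"
blocks = partition_blocks(text)            # shown at the first zoom level
targets = second_level_targets(blocks[1])  # shown after the user selects block 1
action = choose_action(targets[1])         # user selects "555-0100"
```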

    User verification of a generative response to a multimodal query

    Publication number: US12277635B1

    Publication date: 2025-04-15

    Application number: US18532470

    Filing date: 2023-12-07

    Applicant: Google LLC

    Abstract: A multimodal search system is described. The system can receive image data from a user device. Additionally, the system can receive a prompt associated with the image data. Moreover, the system can determine, using a computer vision model, a first object in the image data that is associated with the prompt. Furthermore, the system can receive, from the user device, a user indication on whether the image data includes the first object. Subsequently, in response to receiving the user indication, the system can generate a response using a large language model.
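
The order of steps in this abstract (detect an object, ask the user to confirm it, then generate a response) can be sketched as below. The computer vision model, the user prompt, and the large language model are passed in as plain callables so the sketch stays self-contained. All names here are hypothetical, and folding the user's indication into the text given to the language model is an assumption about one way to use it.

```python
from typing import Callable


def answer_with_verification(
    image: bytes,
    prompt: str,
    detect_object: Callable[[bytes, str], str],
    confirm_with_user: Callable[[str], bool],
    generate: Callable[[str], str],
) -> str:
    """Detect an object, collect the user's indication, then generate a response."""
    candidate = detect_object(image, prompt)   # stand-in for the computer vision model
    confirmed = confirm_with_user(candidate)   # user indication received from the device
    if confirmed:
        context = f"User confirmed object: {candidate}."
    else:
        context = f"User rejected detected object: {candidate}."
    return generate(f"{context}\nUser prompt: {prompt}")  # stand-in for the LLM


# Stub callables for illustration; the generator simply echoes its input.
reply = answer_with_verification(
    image=b"",
    prompt="How often should I water this?",
    detect_object=lambda img, p: "fern",
    confirm_with_user=lambda label: True,
    generate=lambda text: text,
)
```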
