Abstract:
Systems and methods are provided for detecting and ranking entities identified in screen content displayed on a mobile device. For example, a method includes receiving an image captured from a mobile device display for a mobile application and determining a window that includes a chronological set of images, the images each representing a respective screen captured from a display of a mobile device and having an associated timestamp. The method also includes identifying entities appearing in images in a first portion of the window using text for images in a remaining portion of the window as context to disambiguate ambiguous entity references.
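The window-based disambiguation described above can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: the `Screen` record, the `CANDIDATES` table, and the keyword-overlap heuristic are hypothetical stand-ins, not the method the abstract claims.

```python
from dataclasses import dataclass


@dataclass
class Screen:
    timestamp: float  # capture time in seconds
    text: str         # text recognized on the captured screen


# Hypothetical lookup: ambiguous mention -> candidate entities and context keywords.
CANDIDATES = {
    "jaguar": {
        "Jaguar (animal)": {"zoo", "cat", "jungle"},
        "Jaguar (car)": {"engine", "dealer", "sedan"},
    }
}


def build_window(screens, center_ts, span):
    """Return the screens within `span` seconds of `center_ts`, oldest first."""
    picked = [s for s in screens if abs(s.timestamp - center_ts) <= span]
    return sorted(picked, key=lambda s: s.timestamp)


def disambiguate(window, first_count):
    """Resolve mentions in the first `first_count` screens, using the
    text of the remaining screens in the window as context."""
    first, rest = window[:first_count], window[first_count:]
    context = set()
    for screen in rest:
        context.update(screen.text.lower().split())
    resolved = {}
    for screen in first:
        for word in screen.text.lower().split():
            options = CANDIDATES.get(word)
            if options:
                # Assumed heuristic: pick the candidate sharing the most keywords with the context.
                resolved[word] = max(options, key=lambda e: len(options[e] & context))
    return resolved


screens = [
    Screen(1.0, "Saw a jaguar today"),
    Screen(2.0, "the dealer showed the engine"),
    Screen(100.0, "zoo cat jungle"),
]
window = build_window(screens, center_ts=1.0, span=10.0)
result = disambiguate(window, first_count=1)
```

In this example the third screen falls outside the window, so only the second screen contributes context and the mention resolves to the car sense.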
Abstract:
Systems and methods are provided for sharing a screen from a mobile device. For example, a method includes capturing an image of a screen displayed on the mobile device in response to a command to share the screen, receiving user instructions for redacting a portion of the image, and transmitting the image with the selected portion redacted to a recipient device selected by the user. As another example, a method includes receiving, from a first mobile device, an identifier for a recipient and an image representing a captured screen of the first mobile device, copying the image to an image repository associated with the recipient, performing recognition on the image, generating annotation data for the image, based on the recognition, that includes at least one visual cue, and providing the image and the annotation data to a second mobile device, the second mobile device being associated with the recipient.
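The redaction step in the first example can be sketched as follows. This is an illustrative sketch only: the image is modeled as a list of pixel rows, and the `redact` helper and its rectangle parameters are hypothetical, not taken from the abstract.

```python
def redact(image, left, top, right, bottom, fill=0):
    """Return a copy of `image` (a list of pixel rows) with the rectangle
    [left, right) x [top, bottom) overwritten by `fill`."""
    out = [row[:] for row in image]  # copy so the original capture is untouched
    for y in range(max(top, 0), min(bottom, len(out))):
        row = out[y]
        for x in range(max(left, 0), min(right, len(row))):
            row[x] = fill
    return out


captured = [[1] * 4 for _ in range(3)]
shared = redact(captured, left=1, top=0, right=3, bottom=2)
```

Returning a copy keeps the unredacted capture on the sending device, so only `shared` would be transmitted to the recipient.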
Abstract:
A facial recognition search system identifies one or more likely names (or other personal identifiers) corresponding to the facial image(s) in a visual query as follows. After receiving the visual query with one or more facial images, the system identifies images that potentially match a respective facial image in accordance with visual similarity criteria. Then one or more persons associated with the potentially matching images are identified. For each identified person, person-specific data comprising metrics of social connectivity to the requester are retrieved from a plurality of applications such as communications applications, social networking applications, calendar applications, and collaborative applications. An ordered list of persons is then generated by ranking the identified persons in accordance with at least metrics of visual similarity between the respective facial image and the potential image matches and with the social connection metrics. Finally, at least one person identifier from the list is sent to the requester.
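One way to picture the ranking step is a weighted combination of the two kinds of metrics. The sketch below is an assumption for illustration: the abstract does not say how the metrics are combined, and the `Candidate` record, the normalized scores, and the `visual_weight` parameter are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    person_id: str
    visual_similarity: float    # assumed normalized to 0..1
    social_connectivity: float  # assumed normalized to 0..1


def rank_candidates(candidates, visual_weight=0.6):
    """Order candidates by a linear blend of visual and social scores (assumed formula)."""
    social_weight = 1.0 - visual_weight

    def score(c):
        return visual_weight * c.visual_similarity + social_weight * c.social_connectivity

    return sorted(candidates, key=score, reverse=True)


ranked = rank_candidates([
    Candidate("person_a", visual_similarity=0.9, social_connectivity=0.1),
    Candidate("person_b", visual_similarity=0.8, social_connectivity=0.9),
    Candidate("person_c", visual_similarity=0.3, social_connectivity=0.2),
])
ranked_ids = [c.person_id for c in ranked]
```

Here `person_b` outranks `person_a` despite a slightly weaker visual match, because a strong social connection to the requester raises the blended score.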
Abstract:
A method, system, and computer readable storage medium are provided for identifying textual terms in response to a visual query. A server system receives a visual query from a client system. The visual query is responded to as follows. A set of image feature values for the visual query is generated. The set of image feature values is mapped to a plurality of textual terms, including a weight for each of the textual terms in the plurality of textual terms. The textual terms are ranked in accordance with the weights of the textual terms. Then, in accordance with the ranking of the textual terms, one or more of the ranked textual terms are sent to the client system.
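The mapping from image feature values to weighted, ranked textual terms might look like the sketch below. The `FEATURE_TO_TERMS` table, the feature names, and the additive weighting are hypothetical choices made for illustration; the abstract does not specify how the mapping or the weights are computed.

```python
from collections import defaultdict

# Hypothetical mapping: image feature -> textual terms with per-term weights.
FEATURE_TO_TERMS = {
    "f_round": {"ball": 0.8, "orange": 0.5},
    "f_orange_hue": {"orange": 0.9, "sunset": 0.4},
}


def rank_terms(feature_values, top_k=3):
    """Accumulate a weight per term from the feature values, then return the
    `top_k` (term, weight) pairs, highest weight first."""
    weights = defaultdict(float)
    for feature, value in feature_values.items():
        for term, term_weight in FEATURE_TO_TERMS.get(feature, {}).items():
            weights[term] += value * term_weight
    ranked = sorted(weights.items(), key=lambda kv: kv[1], reverse=True)
    return ranked[:top_k]


top_terms = [term for term, _ in rank_terms({"f_round": 1.0, "f_orange_hue": 1.0})]
```

With both features active, "orange" collects weight from each of them and ranks first, ahead of "ball" and "sunset".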
Abstract:
A server system receives a visual query from a client system and performs optical character recognition (OCR) on the visual query to produce text recognition data representing textual characters, including a plurality of textual characters in a contiguous region of the visual query. The server system also produces structural information associated with the textual characters in the visual query. Textual characters in the plurality of textual characters are scored. The server system then identifies, in accordance with the scoring, one or more high quality textual strings, each comprising a plurality of high quality textual characters from among the plurality of textual characters in the contiguous region of the visual query. A canonical document that includes the one or more high quality textual strings and that is consistent with the structural information is retrieved. At least a portion of the canonical document is sent to the client system.
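The step of selecting high quality textual strings from scored characters can be sketched as below. This is a simplified illustration under stated assumptions: each character is paired with a hypothetical confidence score, and the `threshold` and `min_length` parameters are invented for the example rather than drawn from the abstract.

```python
def high_quality_strings(chars, threshold=0.8, min_length=3):
    """Return maximal runs of consecutive characters whose score is at least
    `threshold`, keeping only runs of `min_length` characters or more.

    `chars` is a list of (character, score) pairs in reading order."""
    strings, run = [], []
    for ch, score in chars:
        if score >= threshold:
            run.append(ch)
        else:
            # A low-scoring character ends the current run.
            if len(run) >= min_length:
                strings.append("".join(run))
            run = []
    if len(run) >= min_length:
        strings.append("".join(run))
    return strings


ocr_chars = list(zip("hello wrld", [0.9] * 5 + [0.2, 0.9, 0.9, 0.5, 0.9]))
strings = high_quality_strings(ocr_chars)
```

Only "hello" survives in this example: the low-scoring characters break the remaining text into runs shorter than `min_length`. The surviving strings would then serve as the query for retrieving a canonical document.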