-
Publication number: US10706322B1
Publication date: 2020-07-07
Application number: US15821629
Filing date: 2017-11-22
Applicant: Amazon Technologies, Inc.
Inventor: Shuo Yang, Hao Wu, Jonathan Wu, Meng Wang
Abstract: Embodiments of the present disclosure provide systems and processes for automatically determining a layout of text within an image that makes sense from a semantic perspective. In certain embodiments, the systems disclosed herein receive bounding box information relating to one or more bounding boxes that surround text within the image. The systems compare the received bounding box information to determine a clustering of bounding boxes that have an above-threshold probability of containing words that, when read in order, make sense semantically. For example, systems herein can determine whether words in a cluster correspond to a line of text.
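As a rough illustration of grouping word bounding boxes into lines of text, the sketch below uses a simple vertical-overlap heuristic. This is an assumption made for illustration only: the abstract describes a probability threshold but does not say how it is computed, so the names `vertical_overlap`, `cluster_into_lines`, and the `threshold` parameter are hypothetical and do not come from the patent.

```python
from typing import List, Tuple

# Hypothetical box format: (x_min, y_min, x_max, y_max) in pixels.
Box = Tuple[float, float, float, float]


def vertical_overlap(a: Box, b: Box) -> float:
    """Fraction of the shorter box's height that the two boxes share vertically."""
    overlap = min(a[3], b[3]) - max(a[1], b[1])
    shorter = min(a[3] - a[1], b[3] - b[1])
    if shorter <= 0:
        return 0.0
    return max(0.0, overlap) / shorter


def cluster_into_lines(boxes: List[Box], threshold: float = 0.5) -> List[List[Box]]:
    """Greedily group boxes into lines, then order each line left to right.

    The overlap score stands in for the patent's unspecified probability.
    """
    lines: List[List[Box]] = []
    for box in sorted(boxes, key=lambda b: b[1]):  # top to bottom
        for line in lines:
            # Compare against the box most recently added to the line.
            if vertical_overlap(line[-1], box) >= threshold:
                line.append(box)
                break
        else:
            lines.append([box])  # no existing line matched
    return [sorted(line, key=lambda b: b[0]) for line in lines]
```

Each returned inner list holds the boxes of one candidate line in reading order; a production system would likely replace the overlap score with a learned or calibrated probability.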
-
Publication number: US10423827B1
Publication date: 2019-09-24
Application number: US15641774
Filing date: 2017-07-05
Applicant: Amazon Technologies, Inc.
Inventor: Jonathan Wu, Meng Wang, Wei Xia, Ranju Das
Abstract: A method and system for analyzing text in an image. Classification and localization information is identified for the image at a word and character level. A detailed profile is generated that includes attributes of the words and characters identified in the image. One or more objects representing a predicted source of the text are identified in the image. In one embodiment, neural networks are employed to determine localization information and classification information associated with the identified object of interest (e.g., a text string, a character, or a text source).
-
Publication number: US11481683B1
Publication date: 2022-10-25
Application number: US16888589
Filing date: 2020-05-29
Applicant: Amazon Technologies, Inc.
Inventor: Kunwar Yashraj Singh, Joaquin Zepeda Salvatierra, Erhan Bas, Vijay Mahadevan, Jonathan Wu, Rahul Bhotika
Abstract: Techniques for creating machine learning models for direct homography regression for image rectification are described. In certain embodiments, a training service trains an algorithm on a source view of a training image and a homography matrix of the training image into a machine learning model that generates a normalized homography matrix for an input of the source view. The normalized homography matrix may then be utilized to generate a target view of an image input into the machine learning model. The target view of the image may be used in a document processing pipeline for document images captured using cameras.
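To make the role of a homography matrix concrete, the sketch below shows how a 3x3 homography maps a source-view point to a target-view point. It assumes one common normalization convention (scaling the matrix so its bottom-right entry is 1); the abstract does not define what "normalized" means in the patent, and the helper names `normalize_homography` and `warp_point` are hypothetical.

```python
from typing import List, Tuple

Matrix = List[List[float]]  # 3x3 homography, row-major


def normalize_homography(h: Matrix) -> Matrix:
    """Scale the matrix so that its bottom-right entry equals 1.

    Assumed convention; the patent may normalize differently.
    """
    scale = h[2][2]
    if scale == 0:
        raise ValueError("cannot normalize: bottom-right entry is zero")
    return [[value / scale for value in row] for row in h]


def warp_point(h: Matrix, x: float, y: float) -> Tuple[float, float]:
    """Map a source-view point (x, y) to the target view."""
    u = h[0][0] * x + h[0][1] * y + h[0][2]
    v = h[1][0] * x + h[1][1] * y + h[1][2]
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    if w == 0:
        raise ValueError("point maps to infinity under this homography")
    return (u / w, v / w)  # divide out the homogeneous coordinate
```

Because a homography is defined only up to scale, `warp_point` gives the same result for a matrix and its normalized form. Rectifying a whole image would apply this mapping (or its inverse) to every pixel, typically through an image library.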
-
Publication number: US12039027B1
Publication date: 2024-07-16
Application number: US17709262
Filing date: 2022-03-30
Applicant: Amazon Technologies, Inc.
Inventor: Xiang Xu, Hao Zhou, Jonathan Wu, Joseph P Tighe
IPC: G06F21/32
CPC classification number: G06F21/32
Abstract: A system for evaluating a biometric authorization system is described. The biometric authorization system is configured to apply a facial recognition model to image data to make an authorization determination based on detection of synthesized image data and based on matching a reference image to the image data. The system is also configured to execute one or more synthetic image data attack protocols to evaluate the biometric authorization system. The system also generates, according to one or more synthetic image data generation techniques, an evaluation set of image data comprising synthesized representations of a target and sends one or more authorization requests using the evaluation set of image data to the biometric authorization system. The system generates an evaluation of the biometric authorization system for synthetic image data attack analysis based on respective responses to the one or more authorization requests received from the biometric authorization system.
-
Publication number: US11308354B1
Publication date: 2022-04-19
Application number: US16834997
Filing date: 2020-03-30
Applicant: Amazon Technologies, Inc.
Inventor: Ron Litman, Oron Anschel, Shahar Tsiper, Roee Litman, Shai Mazor, Jonathan Wu, Raghavan Manmatha
Abstract: Techniques for recognizing text in an image are described. An exemplary method may include receiving a request to recognize text in an image; extracting features from the image and generating a visual feature sequence from the extracted features; performing selective contextual refinement using at least one selective contextual refinement block of a stack of selective contextual refinement blocks to generate a text prediction by: generating a contextual feature map and combining the contextual feature map with the visual feature sequence into a visual feature space, and applying a selective decoder that utilizes a two-step attention on the visual feature space to generate a text prediction, wherein the two-step attention includes performing a 1-D self-attention computation to generate attentional features and decoding the attentional features to generate the text prediction; and outputting the generated text prediction.
-
Publication number: US10572760B1
Publication date: 2020-02-25
Application number: US15810991
Filing date: 2017-11-13
Applicant: Amazon Technologies, Inc.
Inventor: Hao Wu, Jonathan Wu, Meng Wang, Wei Xia
Abstract: A method and system for analyzing text in an image is disclosed. A text localization and classification system accesses an annotated image comprising a plurality of text location identifiers for a given item of text. A neural network predicts the location of the given item of text using at least a first location identifier and a second location identifier. Optionally, the first location identifier comprises a first shape and the second location identifier comprises a second shape. A first loss is generated using a first loss function, the first loss corresponding to the predicted location using the first location identifier. A second loss is generated using a second loss function, the second loss corresponding to the predicted location using the second location identifier. The neural network is enhanced with backpropagation using the first loss and the second loss.
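The idea of updating a model with two losses at once can be sketched with a toy example. Everything here is an assumption made for illustration: the "network" is reduced to a directly optimized coordinate list, both location identifiers are encoded as coordinate lists of the same length, and the choice of squared error for the first loss and absolute error for the second is hypothetical, since the abstract does not name the loss functions.

```python
from typing import List


def squared_error(pred: List[float], target: List[float]) -> float:
    """First (hypothetical) loss: sum of squared differences."""
    return sum((p - t) ** 2 for p, t in zip(pred, target))


def absolute_error(pred: List[float], target: List[float]) -> float:
    """Second (hypothetical) loss: sum of absolute differences."""
    return sum(abs(p - t) for p, t in zip(pred, target))


def sign(x: float) -> float:
    return (x > 0) - (x < 0)


def train_step(pred: List[float], target_a: List[float], target_b: List[float],
               lr: float = 0.1) -> List[float]:
    """One gradient-descent step on the sum of both losses.

    In a real network the same combined gradient would be backpropagated
    through the layers instead of applied to the prediction directly.
    """
    grad_a = [2 * (p - t) for p, t in zip(pred, target_a)]  # d(squared)/dp
    grad_b = [sign(p - t) for p, t in zip(pred, target_b)]  # d(absolute)/dp
    return [p - lr * (ga + gb) for p, ga, gb in zip(pred, grad_a, grad_b)]


def total_loss(pred: List[float], target_a: List[float],
               target_b: List[float]) -> float:
    return squared_error(pred, target_a) + absolute_error(pred, target_b)
```

Calling `train_step` repeatedly moves the prediction toward a compromise between the two annotations; weighting the two gradients differently would shift that compromise toward one identifier.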