-
Publication No.: US11308354B1
Publication date: 2022-04-19
Application No.: US16834997
Application date: 2020-03-30
Applicant: Amazon Technologies, Inc.
Inventor: Ron Litman, Oron Anschel, Shahar Tsiper, Roee Litman, Shai Mazor, Jonathan Wu, Raghavan Manmatha
Abstract: Techniques for recognizing text in an image are described. An exemplary method may include receiving a request to recognize text in an image; extracting features from the image and generating a visual feature sequence from the extracted features; performing selective contextual refinement using at least one selective contextual refinement block of a stack of selective contextual refinement blocks to generate a text prediction by: generating a contextual feature map and combining the contextual feature map with the visual feature sequence into a visual feature space, and applying a selective decoder that utilizes a two-step attention on the visual feature space to generate a text prediction, wherein the two-step attention includes performing a 1-D self-attention computation to generate attentional features and decoding the attentional features to generate the text prediction; and outputting the generated text prediction.
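Illustrative sketch (not the patented implementation): the two steps named in the abstract, a 1-D self-attention pass over the visual feature sequence followed by decoding of the attentional features, might look roughly like the Python below. It assumes the inputs serve directly as queries, keys, and values with no learned projections. `greedy_decode`, `class_weights`, and the two-letter alphabet are hypothetical stand-ins for a trained decoder.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention_1d(seq):
    """Step 1: scaled dot-product self-attention along the sequence axis.

    Assumption: queries, keys, and values are the input vectors themselves.
    """
    d = len(seq[0])
    attended = []
    for q in seq:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in seq]
        weights = softmax(scores)
        attended.append(
            [sum(w * v[i] for w, v in zip(weights, seq)) for i in range(d)]
        )
    return attended


def greedy_decode(features, class_weights, alphabet):
    """Step 2 (hypothetical): pick the highest-scoring character per position."""
    chars = []
    for f in features:
        logits = [sum(a * b for a, b in zip(f, w)) for w in class_weights]
        chars.append(alphabet[logits.index(max(logits))])
    return "".join(chars)


# Toy visual feature sequence: three positions, two feature channels.
visual_sequence = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
attended = self_attention_1d(visual_sequence)
text = greedy_decode(
    attended, class_weights=[[1.0, 0.0], [0.0, 1.0]], alphabet="ab"
)
```

Each attended vector is a weighted average of the whole sequence, so every position's prediction can draw on context from the other positions.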
-
Publication No.: US11257006B1
Publication date: 2022-02-22
Application No.: US16196662
Application date: 2018-11-20
Applicant: Amazon Technologies, Inc.
Inventor: Oron Anschel, Amit Adam, Shahar Tsiper, Hadar Averbuch Elor, Shai Mazor, Rahul Bhotika, Stefano Soatto
IPC: G06N20/00, G06K9/00, G06F40/169
Abstract: Techniques for auto-generation of annotated real-world training data are described. An electronic document is analyzed to determine text represented in the document and corresponding locations of the text. A representation of the electronic document is modified to include markers and printed. The printed document is photographed in real-world environments, and the markers within the digital photographs are analyzed to allow for the depiction of the document within the photographs to be rectified. The text and location data are used to annotate the rectified images.
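Illustrative sketch (not the patented implementation): one common way to rectify a photographed page from detected markers is to estimate a homography from four marker correspondences and map photo coordinates back into document coordinates. The Python below assumes exactly four markers whose document positions are known. The marker coordinates and the helper names `solve_linear`, `homography`, and `apply_h` are hypothetical.

```python
def solve_linear(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [list(map(float, row)) + [float(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def homography(src, dst):
    """Estimate the 3x3 homography (last entry fixed to 1) from four point pairs."""
    a, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        a.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        b.append(u)
        a.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b.append(v)
    h = solve_linear(a, b)
    return [h[0:3], h[3:6], [h[6], h[7], 1.0]]


def apply_h(h, point):
    """Map a 2-D point through the homography h."""
    x, y = point
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    u = (h[0][0] * x + h[0][1] * y + h[0][2]) / w
    v = (h[1][0] * x + h[1][1] * y + h[1][2]) / w
    return (u, v)


# Hypothetical marker positions: as detected in the photo, and as printed.
photo_markers = [(10, 20), (110, 30), (120, 140), (5, 130)]
document_markers = [(0, 0), (100, 0), (100, 100), (0, 100)]

H = homography(photo_markers, document_markers)
rectified_corners = [apply_h(H, p) for p in photo_markers]
```

Once the photo is expressed in document coordinates, the text locations extracted from the electronic document can be reused as annotations without manual labeling.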
-
Publication No.: US10970530B1
Publication date: 2021-04-06
Application No.: US16189633
Application date: 2018-11-13
Applicant: Amazon Technologies, Inc.
Inventor: Amit Adam, Oron Anschel, Or Perel, Gal Sabina Star, Omri Ben-Eliezer, Hadar Averbuch Elor, Shai Mazor, Wendy Tse, Andrea Olgiati, Rahul Bhotika, Stefano Soatto
IPC: G06K9/62, G06F40/137, G06F40/169, G06F40/174, G06K9/00, G06N20/00
Abstract: Techniques for grammar-based automated generation of annotated synthetic form training data for machine learning are described. A training data generation engine utilizes a defined grammar to construct a layout for a form, select key-value units to place within the layout, and select attribute variants for the key-value units. The form is rendered and stored at a storage location, where it can be provided along with other similarly-generated forms to be used as training data for a machine learning model.
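Illustrative sketch (not the patented implementation): the three choices named in the abstract, constructing a layout from a grammar, selecting key-value units, and selecting attribute variants, could be prototyped as below. The grammar rules, field names, sample values, and the `separator` attribute are all hypothetical, and "rendering" is reduced to plain text so the example stays self-contained.

```python
import random

# Hypothetical grammar: a form has one or two sections of one to three units.
GRAMMAR = {
    "form": [["section"], ["section", "section"]],
    "section": [["unit"], ["unit", "unit"], ["unit", "unit", "unit"]],
}

# Hypothetical key-value units and attribute variants.
KEY_VALUE_UNITS = {
    "Name": ["Ada Lovelace", "Alan Turing"],
    "Date": ["2020-03-30", "2018-11-13"],
    "City": ["Seattle", "Haifa"],
}
ATTRIBUTE_VARIANTS = {"separator": [":", " -", " ="]}


def expand(symbol, rng):
    """Recursively expand a grammar symbol into a flat list of terminals."""
    if symbol not in GRAMMAR:
        return [symbol]
    leaves = []
    for child in rng.choice(GRAMMAR[symbol]):
        leaves.extend(expand(child, rng))
    return leaves


def generate_form(seed):
    """Generate one synthetic form plus its ground-truth annotations."""
    rng = random.Random(seed)
    layout = expand("form", rng)  # one "unit" terminal per field slot
    annotations = []
    for line_no, _ in enumerate(layout):
        key = rng.choice(sorted(KEY_VALUE_UNITS))
        annotations.append({
            "line": line_no,
            "key": key,
            "value": rng.choice(KEY_VALUE_UNITS[key]),
            "separator": rng.choice(ATTRIBUTE_VARIANTS["separator"]),
        })
    text = "\n".join(
        f"{a['key']}{a['separator']} {a['value']}" for a in annotations
    )
    return {"text": text, "annotations": annotations}


form = generate_form(seed=7)
```

Because the generator picks every key, value, and variant itself, the annotations are exact by construction, which is what makes such forms usable as labeled training data.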
-