-
Publication No.: US11257006B1
Publication Date: 2022-02-22
Application No.: US16196662
Application Date: 2018-11-20
Applicant: Amazon Technologies, Inc.
Inventor: Oron Anschel, Amit Adam, Shahar Tsiper, Hadar Averbuch Elor, Shai Mazor, Rahul Bhotika, Stefano Soatto
IPC: G06N20/00, G06K9/00, G06F40/169
Abstract: Techniques for auto-generation of annotated real-world training data are described. An electronic document is analyzed to determine text represented in the document and corresponding locations of the text. A representation of the electronic document is modified to include markers and printed. The printed document is photographed in real-world environments, and the markers within the digital photographs are analyzed to allow for the depiction of the document within the photographs to be rectified. The text and location data are used to annotate the rectified images.
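The final step of the abstract above (reusing the known text and locations to annotate the rectified images) can be sketched as a coordinate rescaling. This is a minimal illustration, not the patented method: the `TextBox` type, the `scale_annotations` helper, and the page and image sizes are all hypothetical, and the sketch assumes the rectified image covers exactly the full page with no residual distortion.

```python
from dataclasses import dataclass


@dataclass
class TextBox:
    """Hypothetical record: a piece of text and its axis-aligned box."""
    text: str
    x: float       # left edge
    y: float       # top edge
    width: float
    height: float


def scale_annotations(boxes, doc_size, image_size):
    """Map boxes from document units to rectified-image pixels.

    Assumes the rectified image shows the whole page and nothing else,
    so a per-axis scale factor is enough.
    """
    sx = image_size[0] / doc_size[0]
    sy = image_size[1] / doc_size[1]
    return [
        TextBox(b.text, b.x * sx, b.y * sy, b.width * sx, b.height * sy)
        for b in boxes
    ]


# Example: a US Letter page in points, rectified to twice that size in pixels.
boxes = [TextBox("Invoice", 72.0, 36.0, 144.0, 18.0)]
annotated = scale_annotations(boxes, doc_size=(612.0, 792.0), image_size=(1224, 1584))
```

Because the text and box positions come from the original electronic document, no manual labeling of the photograph is needed once the page has been rectified.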
-
Publication No.: US11341605B1
Publication Date: 2022-05-24
Application No.: US16588503
Application Date: 2019-09-30
Applicant: Amazon Technologies, Inc.
Inventor: Kunwar Yashraj Singh, Amit Adam, Shahar Tsiper, Gal Sabina Star, Roee Litman, Hadar Averbuch Elor, Vijay Mahadevan, Rahul Bhotika, Shai Mazor, Mohammed El Hamalawi
Abstract: Techniques for document rectification via homography recovery using machine learning are described. An image rectification system can intelligently make use of multiple pipelines for rectifying document images based on the detected type of device that generated the images. The image rectification system can provide high-quality rectifications without requiring human cooperation, multiple views of the document in multiple images, and/or without being constrained to only be able to process images from one source context.
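For readers unfamiliar with the term, a homography is a 3x3 matrix that maps points between two views of a plane using homogeneous coordinates. The sketch below only shows how such a matrix, once recovered, is applied to points; it does not reproduce the machine-learning recovery step or the device-specific pipelines the abstract mentions. The function name `apply_homography` and the matrix `H_example` are made up for illustration.

```python
def apply_homography(H, point):
    """Apply a 3x3 homography H (nested lists) to a 2D point."""
    x, y = point
    u = H[0][0] * x + H[0][1] * y + H[0][2]
    v = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    if w == 0:
        raise ValueError("point maps to infinity under this homography")
    # Divide by w to return from homogeneous to Cartesian coordinates.
    return (u / w, v / w)


# Illustrative matrix only: scale by 2, then shift by (10, 20).
H_example = [
    [2.0, 0.0, 10.0],
    [0.0, 2.0, 20.0],
    [0.0, 0.0, 1.0],
]

corners = [(0, 0), (100, 0), (100, 50), (0, 50)]
mapped = [apply_homography(H_example, c) for c in corners]
```

In practice a rectification step would warp every pixel with the inverse mapping rather than only the corners; the corner mapping is shown here because it is the smallest case that makes the coordinate change visible.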
-
Publication No.: US10970530B1
Publication Date: 2021-04-06
Application No.: US16189633
Application Date: 2018-11-13
Applicant: Amazon Technologies, Inc.
Inventor: Amit Adam, Oron Anschel, Or Perel, Gal Sabina Star, Omri Ben-Eliezer, Hadar Averbuch Elor, Shai Mazor, Wendy Tse, Andrea Olgiati, Rahul Bhotika, Stefano Soatto
IPC: G06K9/62, G06F40/137, G06F40/169, G06F40/174, G06K9/00, G06N20/00
Abstract: Techniques for grammar-based automated generation of annotated synthetic form training data for machine learning are described. A training data generation engine utilizes a defined grammar to construct a layout for a form, select key-value units to place within the layout, and select attribute variants for the key-value units. The form is rendered and stored at a storage location, where it can be provided along with other similarly-generated forms to be used as training data for a machine learning model.
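The idea of driving form generation from a grammar can be illustrated with a toy example. Everything in the sketch below is an assumption made for illustration: the grammar rules, the key variants, the sample values, and the plain-text "rendering" are hypothetical and far simpler than anything a real training data generation engine would use. It shows only the general pattern: expand a grammar to get a layout, pick key-value units and a variant for each, and emit the rendered form together with its annotations.

```python
import random

# Hypothetical grammar: a form has one or two sections, each with one or two units.
GRAMMAR = {
    "form": [["section"], ["section", "section"]],
    "section": [["unit"], ["unit", "unit"]],
    "unit": [["name"], ["date"], ["amount"]],
}

# Hypothetical attribute variants (alternative key labels) per unit type.
KEY_VARIANTS = {
    "name": ["Name", "Full name"],
    "date": ["Date", "Date of issue"],
    "amount": ["Amount", "Total due"],
}

# Hypothetical sample values per unit type.
SAMPLE_VALUES = {
    "name": ["Alex Doe", "Sam Roe"],
    "date": ["2020-01-15", "2021-06-30"],
    "amount": ["12.50", "980.00"],
}


def expand(symbol, rng):
    """Recursively expand a grammar symbol into a list of terminal unit types."""
    if symbol not in GRAMMAR:
        return [symbol]
    production = rng.choice(GRAMMAR[symbol])
    terminals = []
    for child in production:
        terminals.extend(expand(child, rng))
    return terminals


def generate_form(seed):
    """Return (rendered_text, annotations) for one synthetic form."""
    rng = random.Random(seed)  # seeded so a form can be regenerated
    lines, annotations = [], []
    for row, unit in enumerate(expand("form", rng)):
        key = rng.choice(KEY_VARIANTS[unit])
        value = rng.choice(SAMPLE_VALUES[unit])
        lines.append(f"{key}: {value}")
        annotations.append({"row": row, "key": key, "value": value})
    return "\n".join(lines), annotations


text, annotations = generate_form(seed=7)
```

Since the generator chooses every key and value itself, the annotations are exact by construction, which is what makes synthetic forms of this kind usable as labeled training data.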
-
-