-
Publication No.: US20220172456A1
Publication Date: 2022-06-02
Application No.: US17437238
Filing Date: 2019-03-08
Applicant: Google LLC
Inventor: Jiang Wang, Jiyang Gao, Shengyang Dai
IPC: G06V10/764, G06V10/84, G06V10/20
Abstract: The present disclosure provides systems and methods that include or otherwise leverage an object detection training model for training a machine-learned object detection model. In particular, the training model can obtain first training data and train the machine-learned object detection model using the first training data. The training model can obtain second training data, input the second training data into the machine-learned object detection model, and receive, as an output of the machine-learned object detection model, data that describes the location of a detected object of a target category within images from the second training data. The training model can determine mined training data based on the output of the machine-learned object detection model, and train the machine-learned object detection model based on the mined training data.
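The mining step described in this abstract can be illustrated with a minimal Python sketch. It is not the patented method: the abstract does not say how mined training data is determined from the detector output, so the confidence threshold `min_score`, the `Detection` and `MinedExample` types, and the `detect` callable are all assumptions introduced here for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple

# Box layout (x_min, y_min, x_max, y_max) is an assumption for this sketch.
Box = Tuple[float, float, float, float]


@dataclass(frozen=True)
class Detection:
    """One detector output: where an object is, what it is, and how confident."""
    box: Box
    category: str
    score: float


@dataclass(frozen=True)
class MinedExample:
    """A pseudo-labeled training example derived from a detection."""
    image_id: str
    box: Box
    category: str


def mine_training_data(
    detect: Callable[[str], Sequence[Detection]],
    image_ids: Sequence[str],
    target_category: str,
    min_score: float = 0.8,  # hypothetical threshold, not from the abstract
) -> List[MinedExample]:
    """Run the detector on the second dataset and keep confident detections
    of the target category as mined training data."""
    mined: List[MinedExample] = []
    for image_id in image_ids:
        for det in detect(image_id):
            if det.category == target_category and det.score >= min_score:
                mined.append(MinedExample(image_id, det.box, det.category))
    return mined
```

In a full loop, the detector would first be trained on the first training data, `mine_training_data` would be called on the second training data, and the detector would then be trained again on the returned examples.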
-
Publication No.: US11288719B2
Publication Date: 2022-03-29
Application No.: US16802864
Filing Date: 2020-02-27
Applicant: Google LLC
Inventor: Yang Xu, Jiang Wang, Shengyang Dai
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for converting unstructured documents to structured key-value pairs. In one aspect, a method comprises: providing an image of a document to a detection model, wherein: the detection model is configured to process the image to generate an output that defines one or more bounding boxes generated for the image; and each bounding box generated for the image is predicted to enclose a key-value pair comprising key textual data and value textual data, wherein the key textual data defines a label that characterizes the value textual data; and for each of the one or more bounding boxes generated for the image: identifying textual data enclosed by the bounding box using an optical character recognition technique; and determining whether the textual data enclosed by the bounding box defines a key-value pair.
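The final step of this abstract, deciding whether the text found inside a bounding box defines a key-value pair, can be sketched as follows. This is a simplified illustration only: the abstract does not specify the decision rule, so the colon-based split and the `KNOWN_KEYS` vocabulary are assumptions, and the input strings stand in for the output of an OCR step that is not shown.

```python
from typing import Dict, FrozenSet, Iterable, Optional, Tuple

# Hypothetical vocabulary of valid keys; the abstract does not list one.
KNOWN_KEYS: FrozenSet[str] = frozenset({"invoice number", "date", "total"})


def parse_key_value(
    text: str, known_keys: FrozenSet[str] = KNOWN_KEYS
) -> Optional[Tuple[str, str]]:
    """Return (key, value) if the OCR text looks like 'key: value' with a
    recognized key, otherwise None."""
    key, sep, value = text.partition(":")
    key, value = key.strip(), value.strip()
    if not sep or not value:
        return None  # no separator, or nothing after it
    if key.lower() not in known_keys:
        return None  # label is not a recognized key
    return key, value


def extract_pairs(box_texts: Iterable[str]) -> Dict[str, str]:
    """Collect key-value pairs from the text of each detected bounding box."""
    pairs: Dict[str, str] = {}
    for text in box_texts:
        parsed = parse_key_value(text)
        if parsed is not None:
            key, value = parsed
            pairs[key] = value
    return pairs
```

Each string passed to `extract_pairs` represents the text recognized inside one bounding box; boxes whose text fails the check are simply skipped.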
-
Publication No.: US10339419B2
Publication Date: 2019-07-02
Application No.: US16208518
Filing Date: 2018-12-03
Applicant: Google LLC
Inventor: Yang Song, Jiang Wang, Charles J. Rosenberg
Abstract: Methods, systems, and apparatus for determining fine-grained image similarity. In one aspect, a method includes training an image embedding function on image triplets by selecting image triplets of first, second and third images; generating, by the image embedding function, first, second and third representations of the features of the first, second and third images; determining, based on the first representation of features and the second representation of features, a first similarity measure for the first image to the second image; determining, based on the first representation of features and the third representation of features, a second similarity measure for the first image to the third image; determining, based on the first and second similarity measures, a performance measure of the image embedding function for the image triplet; and adjusting the parameter weights of the image embedding function based on the performance measures for the image triplets.
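One common way to turn the two similarity measures of a triplet into a single performance measure is a hinge loss over embedding distances. The sketch below assumes that choice; the abstract does not name a specific similarity measure or loss, so the squared Euclidean distance, the `margin` parameter, and the function names are illustrative assumptions. The embedding function itself and the weight update are not shown.

```python
from typing import Sequence


def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two embeddings (smaller = more similar)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def triplet_hinge_loss(
    first: Sequence[float],
    second: Sequence[float],
    third: Sequence[float],
    margin: float = 1.0,  # hypothetical margin, not from the abstract
) -> float:
    """Loss is zero when the first image is closer to the second image than
    to the third by at least `margin`, and grows linearly otherwise."""
    d_second = squared_distance(first, second)  # first vs. second image
    d_third = squared_distance(first, third)    # first vs. third image
    return max(0.0, margin + d_second - d_third)
```

Under this assumption, training would adjust the embedding weights to reduce the loss summed over many triplets, which pushes the second image closer to the first than the third image is.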
-
Publication No.: US10181091B2
Publication Date: 2019-01-15
Application No.: US15504870
Filing Date: 2015-06-19
Applicant: Google LLC
Inventor: Yang Song, Jiang Wang, Charles J. Rosenberg
Abstract: Methods, systems, and apparatus for determining fine-grained image similarity. In one aspect, a method includes training an image embedding function on image triplets by selecting image triplets of first, second and third images; generating, by the image embedding function, first, second and third representations of the features of the first, second and third images; determining, based on the first representation of features and the second representation of features, a first similarity measure for the first image to the second image; determining, based on the first representation of features and the third representation of features, a second similarity measure for the first image to the third image; determining, based on the first and second similarity measures, a performance measure of the image embedding function for the image triplet; and adjusting the parameter weights of the image embedding function based on the performance measures for the image triplets.
-