-
Publication No.: US20220414425A1
Publication date: 2022-12-29
Application No.: US17821076
Filing date: 2022-08-19
Applicant: Google LLC
Inventor: Ming-Hsuan Yang , Xiaojie Jin , Joshua Foster Slocum , Shengyang Dai , Jiang Wang
IPC: G06N3/04 , G06N20/00 , G06F16/901
Abstract: Methods and systems, including computer programs encoded on computer storage media, for neural network architecture search. A method includes defining a neural network computational cell, the computational cell including a directed graph of nodes representing respective neural network latent representations and edges representing respective operations that transform a respective neural network latent representation; replacing each operation that transforms a respective neural network latent representation with a respective linear combination of candidate operations, where each candidate operation in a respective linear combination has a respective mixing weight that is parameterized by one or more computational cell hyperparameters; iteratively adjusting values of the computational cell hyperparameters and weights to optimize a validation loss function subject to computational resource constraints; and generating a neural network for performing a machine learning task using the defined computational cell and the adjusted values of the computational cell hyperparameters and weights.
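Illustrative sketch (not taken from the patent text): the abstract describes replacing each edge operation with a linear combination of candidate operations whose mixing weights depend on cell hyperparameters. The Python below assumes the mixing weights are a softmax over one hyperparameter per candidate, which is a common choice in differentiable architecture search but is not stated in the abstract. The candidate operations, the per-operation costs, and the names `mixed_operation` and `expected_cost` are hypothetical.

```python
import math
from typing import Callable, List, Sequence

Vector = List[float]


def softmax(alphas: Sequence[float]) -> List[float]:
    """Turn raw hyperparameters into mixing weights that sum to 1 (assumed form)."""
    peak = max(alphas)
    exps = [math.exp(a - peak) for a in alphas]
    total = sum(exps)
    return [e / total for e in exps]


def mixed_operation(
    x: Vector,
    candidates: Sequence[Callable[[Vector], Vector]],
    alphas: Sequence[float],
) -> Vector:
    """Apply every candidate operation to x and blend the results linearly."""
    weights = softmax(alphas)
    outputs = [op(x) for op in candidates]
    return [
        sum(w * out[i] for w, out in zip(weights, outputs))
        for i in range(len(x))
    ]


def expected_cost(alphas: Sequence[float], costs: Sequence[float]) -> float:
    """Weight-averaged resource cost, one hypothetical way to express a constraint."""
    return sum(w * c for w, c in zip(softmax(alphas), costs))


# Hypothetical candidate operations on a latent vector.
CANDIDATES = [
    lambda x: list(x),                # identity (skip connection)
    lambda x: [0.0 for _ in x],       # zero (no connection)
    lambda x: [2.0 * v for v in x],   # a simple scaling transform
]

if __name__ == "__main__":
    print(mixed_operation([3.0, 6.0], CANDIDATES, [0.0, 0.0, 0.0]))
    print(expected_cost([0.0, 0.0, 0.0], [1.0, 0.0, 2.0]))
```

In a search loop under these assumptions, the hyperparameters would be adjusted to lower the validation loss while keeping a quantity such as `expected_cost` within a budget.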
-
Publication No.: US11816710B2
Publication date: 2023-11-14
Application No.: US17653097
Filing date: 2022-03-01
Applicant: Google LLC
Inventor: Yang Xu , Jiang Wang , Shengyang Dai
IPC: G06V30/142 , G06Q30/04 , G06V30/412 , G06V30/414
CPC classification number: G06Q30/04 , G06V30/412 , G06V30/414
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for converting unstructured documents to structured key-value pairs. In one aspect, a method includes: providing an image of a document to a detection model, wherein: the detection model is configured to process the image to generate an output that defines one or more bounding boxes generated for the image; and each bounding box generated for the image is predicted to enclose a key-value pair including key textual data and value textual data, wherein the key textual data defines a label that characterizes the value textual data; and for each of the one or more bounding boxes generated for the image: identifying textual data enclosed by the bounding box using an optical character recognition technique; and determining whether the textual data enclosed by the bounding box defines a key-value pair.
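Illustrative sketch (not taken from the patent text): the abstract ends with a step that decides whether the OCR text inside a bounding box defines a key-value pair, without saying how. The Python below assumes one simple possibility: split the text at the first colon and accept it only if the left part matches a list of known keys. The key list and the function name `parse_key_value` are hypothetical.

```python
from typing import Optional, Set, Tuple

# Hypothetical set of labels treated as valid keys.
KNOWN_KEYS: Set[str] = {"date", "total", "invoice number"}


def parse_key_value(
    text: str, known_keys: Set[str] = KNOWN_KEYS
) -> Optional[Tuple[str, str]]:
    """Return (key, value) if the text looks like a key-value pair, else None."""
    key, separator, value = text.partition(":")
    key, value = key.strip(), value.strip()
    # Require a separator, a non-empty value, and a recognised key.
    if separator and value and key.lower() in known_keys:
        return key, value
    return None


if __name__ == "__main__":
    for ocr_text in ["Total: $42.00", "Thank you for your business"]:
        print(ocr_text, "->", parse_key_value(ocr_text))
```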
-
Publication No.: US20190102651A1
Publication date: 2019-04-04
Application No.: US16208518
Filing date: 2018-12-03
Applicant: Google LLC
Inventor: Yang Song , Jiang Wang , Charles J. Rosenberg
CPC classification number: G06K9/6215 , G06F16/51 , G06F16/5838 , G06K9/6212 , G06K9/627 , G06K9/66 , G06N3/0454 , G06N20/00
Abstract: Methods, systems, and apparatus, for determining fine-grained image similarity. In one aspect, a method includes training an image embedding function on image triplets by selecting image triplets of first, second and third images; generating, by the image embedding function, first, second and third representations of the features of the first, second and third images; determining, based on the first representation of features and the second representation of features, a first similarity measure for the first image to the second image; determining, based on the first representation of features and the third representation of features, a second similarity measure for the first image to the third image; determining, based on the first and second similarity measures, a performance measure of the image embedding function for the image triplet; and adjusting the parameter weights of the image embedding function based on the performance measures for the image triplets.
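Illustrative sketch (not taken from the patent text): the abstract describes computing two similarity measures per triplet and combining them into a performance measure. The Python below assumes Euclidean distance as the similarity measure and a hinge loss with a margin as the performance measure, and it assumes the second image is the one meant to be closer to the first. These choices and the name `triplet_hinge_loss` are assumptions for illustration.

```python
import math
from typing import Sequence


def distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two embeddings (assumed similarity measure)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_hinge_loss(
    first: Sequence[float],
    second: Sequence[float],
    third: Sequence[float],
    margin: float = 1.0,
) -> float:
    """Zero when the second image is closer to the first than the third by at least the margin."""
    first_to_second = distance(first, second)
    first_to_third = distance(first, third)
    return max(0.0, margin + first_to_second - first_to_third)


if __name__ == "__main__":
    # Embeddings are made-up 2-D vectors standing in for image features.
    print(triplet_hinge_loss([0.0, 0.0], [0.0, 1.0], [3.0, 4.0]))
    print(triplet_hinge_loss([0.0, 0.0], [3.0, 4.0], [0.0, 1.0]))
```

Under these assumptions, training would adjust the embedding function's parameters to reduce this loss summed over many triplets.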
-
Publication No.: US11443162B2
Publication date: 2022-09-13
Application No.: US16549715
Filing date: 2019-08-23
Applicant: Google LLC
Inventor: Ming-Hsuan Yang , Xiaojie Jin , Joshua Foster Slocum , Shengyang Dai , Jiang Wang
IPC: G06N3/04 , G06N20/00 , G06F16/901
Abstract: Methods and systems, including computer programs encoded on computer storage media, for neural network architecture search. A method includes defining a neural network computational cell, the computational cell including a directed graph of nodes representing respective neural network latent representations and edges representing respective operations that transform a respective neural network latent representation; replacing each operation that transforms a respective neural network latent representation with a respective linear combination of candidate operations, where each candidate operation in a respective linear combination has a respective mixing weight that is parameterized by one or more computational cell hyperparameters; iteratively adjusting values of the computational cell hyperparameters and weights to optimize a validation loss function subject to computational resource constraints; and generating a neural network for performing a machine learning task using the defined computational cell and the adjusted values of the computational cell hyperparameters and weights.
-
Publication No.: US10949708B2
Publication date: 2021-03-16
Application No.: US16420154
Filing date: 2019-05-22
Applicant: Google LLC
Inventor: Yang Song , Jiang Wang , Charles J. Rosenberg
Abstract: Methods, systems, and apparatus, for determining fine-grained image similarity. In one aspect, a method includes training an image embedding function on image triplets by selecting image triplets of first, second and third images; generating, by the image embedding function, first, second and third representations of the features of the first, second and third images; determining, based on the first representation of features and the second representation of features, a first similarity measure for the first image to the second image; determining, based on the first representation of features and the third representation of features, a second similarity measure for the first image to the third image; determining, based on the first and second similarity measures, a performance measure of the image embedding function for the image triplet; and adjusting the parameter weights of the image embedding function based on the performance measures for the image triplets.
-
Publication No.: US20190279030A1
Publication date: 2019-09-12
Application No.: US16420154
Filing date: 2019-05-22
Applicant: Google LLC
Inventor: Yang Song , Jiang Wang , Charles J. Rosenberg
Abstract: Methods, systems, and apparatus, for determining fine-grained image similarity. In one aspect, a method includes training an image embedding function on image triplets by selecting image triplets of first, second and third images; generating, by the image embedding function, first, second and third representations of the features of the first, second and third images; determining, based on the first representation of features and the second representation of features, a first similarity measure for the first image to the second image; determining, based on the first representation of features and the third representation of features, a second similarity measure for the first image to the third image; determining, based on the first and second similarity measures, a performance measure of the image embedding function for the image triplet; and adjusting the parameter weights of the image embedding function based on the performance measures for the image triplets.
-
Publication No.: US20220254137A1
Publication date: 2022-08-11
Application No.: US17622462
Filing date: 2019-08-05
Applicant: Jilin TU , Jiang WANG , Huizhong CHEN , Xiangxin ZHU , Shengyang DAI , Google LLC
Inventor: Jilin Tu , Jiang Wang , Huizhong Chen , Xiangxin Zhu , Shengyang Dai
IPC: G06V10/50 , G06V10/778 , G06V10/75
Abstract: A computing system for detecting objects in an image can perform operations including generating an image pyramid that includes a first level corresponding with the image at a first resolution and a second level corresponding with the image at a second resolution. The operations can include tiling the first level and the second level by dividing the first level into a first plurality of tiles and the second level into a second plurality of tiles; inputting the first plurality of tiles and the second plurality of tiles into a machine-learned object detection model; receiving, as an output of the machine-learned object detection model, object detection data that includes bounding boxes respectively defined with respect to individual ones of the first plurality of tiles and the second plurality of tiles; and generating image object detection output by mapping the object detection data onto an image space of the image.
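Illustrative sketch (not taken from the patent text): the abstract describes tiling each pyramid level and then mapping per-tile bounding boxes back onto the original image space. The Python below assumes square, non-overlapping tiles and assumes each level is the image resized by a single scale factor. The function names `tile_origins` and `map_box_to_image` and the box layout `(x0, y0, x1, y1)` are hypothetical.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x0, y0, x1, y1)


def tile_origins(width: int, height: int, tile_size: int) -> List[Tuple[int, int]]:
    """Top-left corners of non-overlapping tiles covering one pyramid level."""
    return [
        (x, y)
        for y in range(0, height, tile_size)
        for x in range(0, width, tile_size)
    ]


def map_box_to_image(box: Box, tile_origin: Tuple[int, int], scale: float) -> Box:
    """Convert a box from tile coordinates to original-image coordinates.

    scale is the level's resolution divided by the original image's resolution,
    for example 0.5 for a half-size level.
    """
    origin_x, origin_y = tile_origin
    x0, y0, x1, y1 = box
    return (
        (x0 + origin_x) / scale,
        (y0 + origin_y) / scale,
        (x1 + origin_x) / scale,
        (y1 + origin_y) / scale,
    )


if __name__ == "__main__":
    # A 4x4 level split into 2x2 tiles, then one detection mapped back.
    print(tile_origins(4, 4, 2))
    print(map_box_to_image((1, 1, 3, 3), (2, 2), 0.5))
```

The box is first shifted by its tile's origin to get level coordinates, then divided by the level's scale to land in the original image space.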
-
Publication No.: US20210056378A1
Publication date: 2021-02-25
Application No.: US16549715
Filing date: 2019-08-23
Applicant: Google LLC
Inventor: Ming-Hsuan Yang , Xiaojie Jin , Joshua Foster Slocum , Shengyang Dai , Jiang Wang
IPC: G06N3/04 , G06N20/00 , G06F16/901
Abstract: Methods and systems, including computer programs encoded on computer storage media, for neural network architecture search. A method includes defining a neural network computational cell, the computational cell including a directed graph of nodes representing respective neural network latent representations and edges representing respective operations that transform a respective neural network latent representation; replacing each operation that transforms a respective neural network latent representation with a respective linear combination of candidate operations, where each candidate operation in a respective linear combination has a respective mixing weight that is parameterized by one or more computational cell hyperparameters; iteratively adjusting values of the computational cell hyperparameters and weights to optimize a validation loss function subject to computational resource constraints; and generating a neural network for performing a machine learning task using the defined computational cell and the adjusted values of the computational cell hyperparameters and weights.
-
Publication No.: US20200273078A1
Publication date: 2020-08-27
Application No.: US16802864
Filing date: 2020-02-27
Applicant: Google LLC
Inventor: Yang Xu , Jiang Wang , Shengyang Dai
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for converting unstructured documents to structured key-value pairs. In one aspect, a method comprises: providing an image of a document to a detection model, wherein: the detection model is configured to process the image to generate an output that defines one or more bounding boxes generated for the image; and each bounding box generated for the image is predicted to enclose a key-value pair comprising key textual data and value textual data, wherein the key textual data defines a label that characterizes the value textual data; and for each of the one or more bounding boxes generated for the image: identifying textual data enclosed by the bounding box using an optical character recognition technique; and determining whether the textual data enclosed by the bounding box defines a key-value pair.
-
Publication No.: US20220309549A1
Publication date: 2022-09-29
Application No.: US17653097
Filing date: 2022-03-01
Applicant: Google LLC
Inventor: Yang Xu , Jiang Wang , Shengyang Dai
IPC: G06Q30/04 , G06V30/412 , G06V30/414
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for converting unstructured documents to structured key-value pairs. In one aspect, a method includes: providing an image of a document to a detection model, wherein: the detection model is configured to process the image to generate an output that defines one or more bounding boxes generated for the image; and each bounding box generated for the image is predicted to enclose a key-value pair including key textual data and value textual data, wherein the key textual data defines a label that characterizes the value textual data; and for each of the one or more bounding boxes generated for the image: identifying textual data enclosed by the bounding box using an optical character recognition technique; and determining whether the textual data enclosed by the bounding box defines a key-value pair.