-
Publication No.: US20220414425A1
Publication Date: 2022-12-29
Application No.: US17821076
Filing Date: 2022-08-19
Applicant: Google LLC
Inventor: Ming-Hsuan Yang, Xiaojie Jin, Joshua Foster Slocum, Shengyang Dai, Jiang Wang
IPC: G06N3/04, G06N20/00, G06F16/901
Abstract: Methods and systems, including computer programs encoded on computer storage media, for neural network architecture search. A method includes defining a neural network computational cell, the computational cell including a directed graph of nodes representing respective neural network latent representations and edges representing respective operations that transform a respective neural network latent representation; replacing each operation that transforms a respective neural network latent representation with a respective linear combination of candidate operations, where each candidate operation in a respective linear combination has a respective mixing weight that is parameterized by one or more computational cell hyperparameters; iteratively adjusting values of the computational cell hyperparameters and weights to optimize a validation loss function subject to computational resource constraints; and generating a neural network for performing a machine learning task using the defined computational cell and the adjusted values of the computational cell hyperparameters and weights.
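The "linear combination of candidate operations" in this abstract can be illustrated with a minimal sketch. It assumes the mixing weights come from a softmax over per-operation hyperparameters, which the abstract does not state, and it adds a simple weighted cost term as one possible stand-in for the resource constraint. The names `mixing_weights`, `mixed_operation`, and `expected_cost` are hypothetical.

```python
import math


def mixing_weights(alphas):
    """Turn raw cell hyperparameters into mixing weights (assumed: softmax)."""
    peak = max(alphas)
    exps = [math.exp(a - peak) for a in alphas]
    total = sum(exps)
    return [e / total for e in exps]


def mixed_operation(x, candidate_ops, alphas):
    """Replace a single edge operation with a weighted sum of candidates."""
    weights = mixing_weights(alphas)
    return sum(w * op(x) for w, op in zip(weights, candidate_ops))


def expected_cost(alphas, op_costs):
    """Weighted resource cost of an edge (an assumed form of the constraint)."""
    weights = mixing_weights(alphas)
    return sum(w * c for w, c in zip(weights, op_costs))


if __name__ == "__main__":
    # Three toy candidate operations on a scalar: identity, scale, and "zero".
    ops = [lambda x: x, lambda x: 2.0 * x, lambda x: 0.0]
    print(mixed_operation(3.0, ops, [0.0, 0.0, 0.0]))
    print(expected_cost([0.0, 0.0, 0.0], [1.0, 2.0, 0.0]))
```

In a full search, the hyperparameters and the network weights would both be updated by gradient steps; this sketch only shows the forward computation of a single mixed edge.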
-
Publication No.: US11829404B2
Publication Date: 2023-11-28
Application No.: US17119546
Filing Date: 2020-12-11
Applicant: Google LLC
Inventor: Shinko Cheng, Eunyoung Kim, Shengyang Dai, Madhur Khandelwal, Kristina Eng, David Loxton
IPC: G06F16/51, G06F16/11, G06F18/2433
CPC classification number: G06F16/51, G06F16/113, G06F18/2433
Abstract: Some implementations relate to archiving of functional images. In some implementations, a method includes accessing images and determining one or more functional labels corresponding to each of the images and one or more confidence scores corresponding to the functional labels. A functional image score is determined for each of the images based on the functional labels having a corresponding confidence score that meets a respective threshold for the functional labels. In response to determining that the functional image score meets a functional image score threshold, a functional image signal is provided that indicates that one or more of the images that meet the functional image score threshold are functional images. The functional images are determined to be archived, and are archived by associating an archive attribute with the functional images such that functional images having the archive attribute are excluded from display in views of the images.
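The two-stage thresholding described in this abstract can be sketched as follows. The abstract does not say how the functional image score is computed from the qualifying labels, so summing their confidence scores is an assumption made here for illustration. The label names, thresholds, and function names are hypothetical.

```python
def functional_image_score(label_scores, label_thresholds):
    """Sum the confidences of labels that meet their own threshold (assumed scoring rule)."""
    return sum(
        confidence
        for label, confidence in label_scores.items()
        if label in label_thresholds and confidence >= label_thresholds[label]
    )


def select_for_archive(images, label_thresholds, score_threshold):
    """Return ids of images whose functional image score meets the threshold."""
    return [
        image_id
        for image_id, label_scores in images.items()
        if functional_image_score(label_scores, label_thresholds) >= score_threshold
    ]


if __name__ == "__main__":
    images = {
        "img_a": {"receipt": 0.9, "screenshot": 0.2},
        "img_b": {"receipt": 0.3},
    }
    thresholds = {"receipt": 0.5, "screenshot": 0.5}
    archived = select_for_archive(images, thresholds, score_threshold=0.8)
    # An "archive attribute" would then be attached to these ids so that
    # default views can filter them out.
    print(archived)
```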
-
Publication No.: US11816710B2
Publication Date: 2023-11-14
Application No.: US17653097
Filing Date: 2022-03-01
Applicant: Google LLC
Inventor: Yang Xu, Jiang Wang, Shengyang Dai
IPC: G06V30/142, G06Q30/04, G06V30/412, G06V30/414
CPC classification number: G06Q30/04, G06V30/412, G06V30/414
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for converting unstructured documents to structured key-value pairs. In one aspect, a method includes: providing an image of a document to a detection model, wherein: the detection model is configured to process the image to generate an output that defines one or more bounding boxes generated for the image; and each bounding box generated for the image is predicted to enclose a key-value pair including key textual data and value textual data, wherein the key textual data defines a label that characterizes the value textual data; and for each of the one or more bounding boxes generated for the image: identifying textual data enclosed by the bounding box using an optical character recognition technique; and determining whether the textual data enclosed by the bounding box defines a key-value pair.
-
Publication No.: US10685680B2
Publication Date: 2020-06-16
Application No.: US16359928
Filing Date: 2019-03-20
Applicant: Google LLC
Inventor: Shengyang Dai, Timothy Sepkoski St. Clair, Koji Ashida, Jingyu Cui, Jay Steele, Qi Gu, Erik Murphy-Chutorian, Ivan Neulander, Flavio Lerda, Eric Charles Henry, Shinko Yuanhsien Cheng, Aravind Krishnaswamy, David Cohen, Pardis Beikzadeh
IPC: H04N5/93, G11B27/036, G11B27/034, G11B27/10, H04N5/76, G11B27/34, G11B27/28, H04N5/262
Abstract: A method includes grouping media items associated with a user into segments based on a timestamp associated with each media item and a total number of media items. The method also includes selecting target media from the media items for each of the segments based on media attributes associated with the media item. The method also includes generating a video that includes the target media for each of the segments by generating a first animation that illustrates a first transition from a first item from the target media to a second item from the target media with movement of the first item from an onscreen location to an offscreen location, wherein the first item is adjacent to the second item in the first animation and determining whether the target media includes one or more additional items. The method also includes adding a song to the video.
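The grouping and selection steps in this abstract can be sketched as follows. The abstract only says that segments depend on timestamps and the total number of items, so the rule used here (sort by timestamp, then split into near-equal consecutive runs, capped by a maximum segment count) is an assumption. The `timestamp` and `quality` keys and both function names are hypothetical.

```python
def group_into_segments(media_items, max_segments=3):
    """Sort items by timestamp and split them into near-equal consecutive segments."""
    ordered = sorted(media_items, key=lambda item: item["timestamp"])
    if not ordered:
        return []
    n_segments = min(max_segments, len(ordered))
    size, extra = divmod(len(ordered), n_segments)
    segments, start = [], 0
    for index in range(n_segments):
        # The first `extra` segments take one additional item each.
        end = start + size + (1 if index < extra else 0)
        segments.append(ordered[start:end])
        start = end
    return segments


def select_target_media(segments):
    """Pick one item per segment by a media attribute (assumed: highest quality)."""
    return [max(segment, key=lambda item: item["quality"]) for segment in segments]


if __name__ == "__main__":
    items = [
        {"timestamp": 5, "quality": 0.4},
        {"timestamp": 1, "quality": 0.7},
        {"timestamp": 3, "quality": 0.9},
        {"timestamp": 2, "quality": 0.1},
        {"timestamp": 4, "quality": 0.8},
    ]
    segments = group_into_segments(items, max_segments=2)
    print(select_target_media(segments))
```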
-
Publication No.: US20240362940A1
Publication Date: 2024-10-31
Application No.: US18306604
Filing Date: 2023-04-25
Applicant: Google LLC
Inventor: Jing Xiong, Tianli Yu, Shengyang Dai
IPC: G06V30/19
CPC classification number: G06V30/1912, G06V30/1916
Abstract: A method includes receiving, from a user device associated with a user, a plurality of annotated documents. Each respective annotated document includes one or more fields and each respective field labeled by a respective annotation. The method includes, for a threshold number of iterations, randomly selecting a respective subset of annotated documents from the plurality of annotated documents; training a respective model on the respective subset of annotated documents; and generating, using the plurality of annotated documents not selected for the respective subset of annotated documents, a respective evaluation of the respective model. The method also includes providing, to the user device, each respective evaluation.
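The iteration described in this abstract resembles repeated random hold-out evaluation, and a minimal sketch of that loop is shown below. The training and evaluation routines are passed in as hypothetical callables (`train_fn`, `eval_fn`), and the fixed subset size and seeded random generator are assumptions not taken from the abstract.

```python
import random


def repeated_holdout(documents, iterations, subset_size, train_fn, eval_fn, seed=0):
    """Train on a random subset and evaluate on the unselected documents, repeatedly."""
    rng = random.Random(seed)
    evaluations = []
    for _ in range(iterations):
        chosen = set(rng.sample(range(len(documents)), subset_size))
        train_docs = [doc for i, doc in enumerate(documents) if i in chosen]
        held_out = [doc for i, doc in enumerate(documents) if i not in chosen]
        model = train_fn(train_docs)
        evaluations.append(eval_fn(model, held_out))
    return evaluations


if __name__ == "__main__":
    docs = list(range(10))
    # Toy stand-ins: the "model" is the set of documents it was trained on,
    # and the "evaluation" reports how many documents were held out.
    results = repeated_holdout(
        docs,
        iterations=3,
        subset_size=7,
        train_fn=lambda train_docs: set(train_docs),
        eval_fn=lambda model, held_out: len(held_out),
    )
    print(results)
```

Each evaluation in the returned list would then be sent back to the user device, one per iteration.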
-
Publication No.: US20220254137A1
Publication Date: 2022-08-11
Application No.: US17622462
Filing Date: 2019-08-05
Applicant: Jilin TU, Jiang WANG, Huizhong CHEN, Xiangxin ZHU, Shengyang DAI, Google LLC
Inventor: Jilin Tu, Jiang Wang, Huizhong Chen, Xiangxin Zhu, Shengyang Dai
IPC: G06V10/50, G06V10/778, G06V10/75
Abstract: A computing system for detecting objects in an image can perform operations including generating an image pyramid that includes a first level corresponding with the image at a first resolution and a second level corresponding with the image at a second resolution. The operations can include tiling the first level and the second level by dividing the first level into a first plurality of tiles and the second level into a second plurality of tiles; inputting the first plurality of tiles and the second plurality of tiles into a machine-learned object detection model; receiving, as an output of the machine-learned object detection model, object detection data that includes bounding boxes respectively defined with respect to individual ones of the first plurality of tiles and the second plurality of tiles; and generating image object detection output by mapping the object detection data onto an image space of the image.
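The tiling and the final mapping step in this abstract can be sketched with plain coordinate arithmetic. The sketch assumes non-overlapping square tiles and boxes given as `(x0, y0, x1, y1)` in tile-local pixels, neither of which is specified in the abstract. `scale` is the ratio of the pyramid level's resolution to the original image's resolution, and both function names are hypothetical.

```python
def tile_level(width, height, tile_size):
    """Divide one pyramid level into tiles, returned as (left, top, right, bottom)."""
    tiles = []
    for top in range(0, height, tile_size):
        for left in range(0, width, tile_size):
            right = min(left + tile_size, width)
            bottom = min(top + tile_size, height)
            tiles.append((left, top, right, bottom))
    return tiles


def map_box_to_image(box, tile, scale):
    """Map a tile-local box to the original image space.

    The tile offset moves the box into level coordinates; dividing by the
    level's scale moves it into original-image coordinates.
    """
    left, top = tile[0], tile[1]
    x0, y0, x1, y1 = box
    return (
        (x0 + left) / scale,
        (y0 + top) / scale,
        (x1 + left) / scale,
        (y1 + top) / scale,
    )


if __name__ == "__main__":
    # A half-resolution level of an 8x4 image, cut into 2x2 tiles.
    tiles = tile_level(width=4, height=2, tile_size=2)
    print(tiles)
    print(map_box_to_image((1, 1, 2, 2), tiles[1], scale=0.5))
```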
-
Publication No.: US20210056378A1
Publication Date: 2021-02-25
Application No.: US16549715
Filing Date: 2019-08-23
Applicant: Google LLC
Inventor: Ming-Hsuan Yang, Xiaojie Jin, Joshua Foster Slocum, Shengyang Dai, Jiang Wang
IPC: G06N3/04, G06N20/00, G06F16/901
Abstract: Methods and systems, including computer programs encoded on computer storage media, for neural network architecture search. A method includes defining a neural network computational cell, the computational cell including a directed graph of nodes representing respective neural network latent representations and edges representing respective operations that transform a respective neural network latent representation; replacing each operation that transforms a respective neural network latent representation with a respective linear combination of candidate operations, where each candidate operation in a respective linear combination has a respective mixing weight that is parameterized by one or more computational cell hyperparameters; iteratively adjusting values of the computational cell hyperparameters and weights to optimize a validation loss function subject to computational resource constraints; and generating a neural network for performing a machine learning task using the defined computational cell and the adjusted values of the computational cell hyperparameters and weights.
-
Publication No.: US20200273078A1
Publication Date: 2020-08-27
Application No.: US16802864
Filing Date: 2020-02-27
Applicant: Google LLC
Inventor: Yang Xu, Jiang Wang, Shengyang Dai
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for converting unstructured documents to structured key-value pairs. In one aspect, a method comprises: providing an image of a document to a detection model, wherein: the detection model is configured to process the image to generate an output that defines one or more bounding boxes generated for the image; and each bounding box generated for the image is predicted to enclose a key-value pair comprising key textual data and value textual data, wherein the key textual data defines a label that characterizes the value textual data; and for each of the one or more bounding boxes generated for the image: identifying textual data enclosed by the bounding box using an optical character recognition technique; and determining whether the textual data enclosed by the bounding box defines a key-value pair.
-
Publication No.: US20220309549A1
Publication Date: 2022-09-29
Application No.: US17653097
Filing Date: 2022-03-01
Applicant: Google LLC
Inventor: Yang Xu, Jiang Wang, Shengyang Dai
IPC: G06Q30/04, G06V30/412, G06V30/414
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for converting unstructured documents to structured key-value pairs. In one aspect, a method includes: providing an image of a document to a detection model, wherein: the detection model is configured to process the image to generate an output that defines one or more bounding boxes generated for the image; and each bounding box generated for the image is predicted to enclose a key-value pair including key textual data and value textual data, wherein the key textual data defines a label that characterizes the value textual data; and for each of the one or more bounding boxes generated for the image: identifying textual data enclosed by the bounding box using an optical character recognition technique; and determining whether the textual data enclosed by the bounding box defines a key-value pair.
-
Publication No.: US20220172456A1
Publication Date: 2022-06-02
Application No.: US17437238
Filing Date: 2019-03-08
Applicant: Google LLC
Inventor: Jiang Wang, Jiyang Gao, Shengyang Dai
IPC: G06V10/764, G06V10/84, G06V10/20
Abstract: The present disclosure provides systems and methods that include or otherwise leverage an object detection training model for training a machine-learned object detection model. In particular, the training model can obtain first training data and train the machine-learned object detection model using the first training data. The training model can obtain second training data and input the second training data into the machine-learned object detection model, and receive as an output of the machine-learned object detection model, data that describes the location of a detected object of a target category within images from the second training data. The training model can determine mined training data based on the output of the machine-learned object detection model, and train the machine-learned object detection model based on the mined training data.