-
Publication No.: US20220174328A1
Publication date: 2022-06-02
Application No.: US17107684
Filing date: 2020-11-30
Applicant: Google LLC
Inventor: George Dan Toderici, Fabian Julius Mentzer, Eirikur Thor Agustsson, Michael Tobias Tschannen
IPC: H04N19/91, H04N19/124, H04N19/154, G06N3/08, G06N3/04
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training an encoder neural network configured to receive a data item and to process the data item to output a compressed representation of the data item. In one aspect, a method includes, for each training data item: processing the data item using the encoder neural network to generate a latent representation of the training data item; processing the latent representation using a hyper-encoder neural network to determine a conditional entropy model; generating a compressed representation of the training data item; processing the compressed representation using a decoder neural network to generate a reconstruction of the training data item; processing the reconstruction of the training data item using a discriminator neural network to generate a discriminator network output; evaluating a first loss function; and determining an update to the current values of the encoder network parameters.
-
Publication No.: US11200423B2
Publication date: 2021-12-14
Application No.: US16687118
Filing date: 2019-11-18
Applicant: Google LLC
Inventor: Balakrishnan Varadarajan, George Dan Toderici, Apostol Natsev, Nitin Khandelwal, Sudheendra Vijayanarasimhan, Weilong Yang, Sanketh Shetty
Abstract: A system and methodology provide for annotating videos with entities and associated probabilities of existence of the entities within video frames. A computer-implemented method identifies an entity from a plurality of entities identifying characteristics of video items. The computer-implemented method selects a set of features correlated with the entity based on a value of a feature of a plurality of features, determines a classifier for the entity using the set of features, and determines an aggregation calibration function for the entity based on the set of features. The computer-implemented method selects a video frame from a video item, where the video frame having associated features, and determines a probability of existence of the entity based on the associated features using the classifier and the aggregation calibration function.
-
Publication No.: US11042553B2
Publication date: 2021-06-22
Application No.: US15819050
Filing date: 2017-11-21
Applicant: GOOGLE LLC
Inventor: Balakrishnan Varadarajan, George Dan Toderici, Apostol Natsev, Weilong Yang, John Burge, Sanketh Shetty, Omid Madani
IPC: G06F16/00, G06F16/2457, G06F16/28, G06F16/78, G06F40/169, G06F40/295
Abstract: Facilitating of content entity annotation while maintaining joint quality, coverage and/or completeness performance conditions is provided. In one example, a non-transitory computer-readable medium comprises computer-readable instructions that, in response to execution, cause a computing system to perform operations. The operations include aggregating information indicative of initial entities for content and initial scores associated with the initial entities received from one or more content annotation sources and mapping the initial scores to respective values to generate calibrated scores. The operations include applying weights to the calibrated scores to generate weighted scores and combining the weighted scores using a linear aggregation model to generate a final score. The operations include determining whether to annotate the content with at least one of the initial entities based on a comparison of the final score and a defined threshold value.
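The aggregation steps named in this abstract (calibrate, weight, combine linearly, compare with a threshold) can be illustrated with a minimal sketch. It is not taken from the patent: the step-function lookup used for calibration, the function names, and all numbers are assumptions chosen only for illustration.

```python
from bisect import bisect_right


def calibrate(score, breakpoints, values):
    """Map a raw source score to a calibrated value.

    Assumption: calibration is a step-function lookup, where `values`
    has one more entry than the sorted `breakpoints`.
    """
    return values[bisect_right(breakpoints, score)]


def final_score(calibrated_scores, weights):
    """Linear aggregation: weighted sum of the calibrated scores."""
    return sum(w * c for w, c in zip(weights, calibrated_scores))


def should_annotate(raw_scores, tables, weights, threshold):
    """Decide whether to annotate by comparing the final score to a threshold."""
    calibrated = [calibrate(s, bp, vals) for s, (bp, vals) in zip(raw_scores, tables)]
    return final_score(calibrated, weights) >= threshold


# Two hypothetical annotation sources, each with its own calibration table.
tables = [([0.5], [0.1, 0.9]), ([0.3, 0.7], [0.0, 0.5, 1.0])]
decision = should_annotate([0.6, 0.4], tables, weights=[0.5, 0.5], threshold=0.6)
```

In this example the two raw scores calibrate to 0.9 and 0.5, so the equally weighted final score is about 0.7 and the entity would be annotated at a threshold of 0.6.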
-
Publication No.: US20210166035A1
Publication date: 2021-06-03
Application No.: US17120525
Filing date: 2020-12-14
Applicant: Google LLC
Inventor: Sanketh Shetty, Tomas Izo, Min-Hsuan Tsai, Sudheendra Vijayanarasimhan, Apostol Natsev, Sami Abu-El-Haija, George Dan Toderici, Susana Ricco, Balakrishnan Varadarajan, Nicola Muscettola, WeiHsin Gu, Weilong Yang, Nitin Khandelwal, Phuong Le
IPC: G06K9/00, G06F16/783
Abstract: A computer-implemented method for selecting representative frames for videos is provided. The method includes receiving a video and identifying a set of features for each of the frames of the video. The features including frame-based features and semantic features. The semantic features identifying likelihoods of semantic concepts being present as content in the frames of the video. A set of video segments for the video is subsequently generated. Each video segment includes a chronological subset of frames from the video and each frame is associated with at least one of the semantic features. The method generates a score for each frame of the subset of frames for each video segment based at least on the semantic features, and selecting a representative frame for each video segment based on the scores of the frames in the video segment. The representative frame represents and summarizes the video segment.
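The per-segment selection described in this abstract can be sketched as follows. The abstract does not say how segments are formed or how frames are scored, so the fixed-length segmentation and the "highest semantic likelihood" score below are assumptions, as are all function names and data.

```python
def split_into_segments(num_frames, segment_length):
    """Assumption: segments are consecutive fixed-length runs of frame indices."""
    return [
        range(start, min(start + segment_length, num_frames))
        for start in range(0, num_frames, segment_length)
    ]


def frame_score(semantic_likelihoods):
    """Assumption: a frame's score is its highest semantic-concept likelihood."""
    return max(semantic_likelihoods.values())


def representative_frames(frames, segment_length):
    """Return, for each segment, the index of its highest-scoring frame."""
    return [
        max(segment, key=lambda i: frame_score(frames[i]))
        for segment in split_into_segments(len(frames), segment_length)
    ]


# Each frame carries hypothetical likelihoods of semantic concepts.
frames = [{"dog": 0.2}, {"dog": 0.9}, {"cat": 0.4}, {"cat": 0.1}, {"cat": 0.3}]
reps = representative_frames(frames, segment_length=2)
```

With a segment length of 2, the five frames form the segments (0, 1), (2, 3), and (4), and the selected representative frames are 1, 2, and 4.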
-
Publication No.: US20200027247A1
Publication date: 2020-01-23
Application No.: US16515586
Filing date: 2019-07-18
Applicant: Google LLC
Inventor: David Charles Minnen, Saurabh Singh, Johannes Balle, Troy Chinen, Sung Jin Hwang, Nicholas Johnston, George Dan Toderici
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for compressing and decompressing data. In one aspect, a method comprises: processing data using an encoder neural network to generate a latent representation of the data; processing the latent representation of the data using a hyper-encoder neural network to generate a latent representation of an entropy model; generating an entropy encoded representation of the latent representation of the entropy model; generating an entropy encoded representation of the latent representation of the data using the latent representation of the entropy model; and determining a compressed representation of the data from the entropy encoded representations of: (i) the latent representation of the data and (ii) the latent representation of the entropy model used to entropy encode the latent representation of the data.
-
Publication No.: US20190356330A1
Publication date: 2019-11-21
Application No.: US15985340
Filing date: 2018-05-21
Applicant: Google LLC
Inventor: David Charles Minnen, Michele Covell, Saurabh Singh, Sung Jin Hwang, George Dan Toderici
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for compressing and decompressing data. In one aspect, an encoder neural network processes data to generate an output including a representation of the data as an ordered collection of code symbols. The ordered collection of code symbols is entropy encoded using one or more code symbol probability distributions. A compressed representation of the data is determined based on the entropy encoded representation of the collection of code symbols and data indicating the code symbol probability distributions used to entropy encode the collection of code symbols. In another aspect, a compressed representation of the data is decoded to determine the collection of code symbols representing the data. A reconstruction of the data is determined by processing the collection of code symbols by a decoder neural network.
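To illustrate why the code symbol probability distribution matters for entropy encoding, the sketch below computes the ideal (Shannon) code length of a symbol sequence under a given distribution, which is the size a good entropy coder approaches. It is a general information-theory illustration, not the patent's encoder; the function name and example values are assumptions.

```python
import math


def ideal_code_length_bits(symbols, probabilities):
    """Sum of -log2 p(s) over the sequence: the ideal entropy-coded size in bits.

    `probabilities` maps each code symbol to its assumed probability.
    """
    return sum(-math.log2(probabilities[s]) for s in symbols)


symbols = [0, 0, 1, 0]

# A uniform model spends one bit per binary symbol.
uniform_bits = ideal_code_length_bits(symbols, {0: 0.5, 1: 0.5})

# A model that better matches the data needs fewer bits for the same sequence.
skewed_bits = ideal_code_length_bits(symbols, {0: 0.75, 1: 0.25})
```

The better the distribution matches the actual symbol statistics, the shorter the ideal code length for the same sequence.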
-
Publication No.: US10482328B2
Publication date: 2019-11-19
Application No.: US15722756
Filing date: 2017-10-02
Applicant: Google LLC
Inventor: Balakrishnan Varadarajan, George Dan Toderici, Apostol Natsev, Nitin Khandelwal, Sudheendra Vijayanarasimhan, Weilong Yang, Sanketh Shetty
Abstract: A system and methodology provide for annotating videos with entities and associated probabilities of existence of the entities within video frames. A computer-implemented method identifies an entity from a plurality of entities identifying characteristics of video items. The computer-implemented method selects a set of features correlated with the entity based on a value of a feature of a plurality of features, determines a classifier for the entity using the set of features, and determines an aggregation calibration function for the entity based on the set of features. The computer-implemented method selects a video frame from a video item, where the video frame having associated features, and determines a probability of existence of the entity based on the associated features using the classifier and the aggregation calibration function.
-
Publication No.: US10192327B1
Publication date: 2019-01-29
Application No.: US15424711
Filing date: 2017-02-03
Applicant: Google LLC
Inventor: George Dan Toderici, Sean O'Malley, Rahul Sukthankar, Sung Jin Hwang, Damien Vincent, Nicholas Johnston, David Charles Minnen, Joel Shor, Michele Covell
Abstract: Methods, and systems, including computer programs encoded on computer storage media for compressing data items with variable compression rate. A system includes an encoder sub-network configured to receive a system input image and to generate an encoded representation of the system input image, the encoder sub-network including a first stack of neural network layers including one or more LSTM neural network layers and one or more non-LSTM neural network layers, the first stack configured to, at each of a plurality of time steps, receive an input image for the time step that is derived from the system input image and generate a corresponding first stack output, and a binarizing neural network layer configured to receive a first stack output as input and generate a corresponding binarized output.
-
Publication No.: US20180089200A1
Publication date: 2018-03-29
Application No.: US15819050
Filing date: 2017-11-21
Applicant: GOOGLE LLC
Inventor: Balakrishnan Varadarajan, George Dan Toderici, Apostol Natsev, Weilong Yang, John Burge, Sanketh Shetty, Omid Madani
CPC classification number: G06F16/24578, G06F16/285, G06F16/78, G06F17/241, G06F17/278
Abstract: Facilitating of content entity annotation while maintaining joint quality, coverage and/or completeness performance conditions is provided. In one example, a non-transitory computer-readable medium comprises computer-readable instructions that, in response to execution, cause a computing system to perform operations. The operations include aggregating information indicative of initial entities for content and initial scores associated with the initial entities received from one or more content annotation sources and mapping the initial scores to respective values to generate calibrated scores. The operations include applying weights to the calibrated scores to generate weighted scores and combining the weighted scores using a linear aggregation model to generate a final score. The operations include determining whether to annotate the content with at least one of the initial entities based on a comparison of the final score and a defined threshold value.