-
Publication No.: US20240107079A1
Publication date: 2024-03-28
Application No.: US18238068
Filing date: 2023-08-25
Applicant: Google LLC
Inventor: George Dan Toderici, Fabian Julius Mentzer, Eirikur Thor Agustsson, Michael Tobias Tschannen
IPC: H04N19/91, G06N3/045, G06N3/088, H04N19/124, H04N19/154
CPC classification number: H04N19/91, G06N3/045, G06N3/088, H04N19/124, H04N19/154
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training an encoder neural network configured to receive a data item and to process the data item to output a compressed representation of the data item. In one aspect, a method includes, for each training data item: processing the data item using the encoder neural network to generate a latent representation of the training data item; processing the latent representation using a hyper-encoder neural network to determine a conditional entropy model; generating a compressed representation of the training data item; processing the compressed representation using a decoder neural network to generate a reconstruction of the training data item; processing the reconstruction of the training data item using a discriminator neural network to generate a discriminator network output; evaluating a first loss function; and determining an update to the current values of the encoder network parameters.
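The order of operations in this abstract can be sketched as a single training-step function. This is a minimal illustration, not the patented implementation: every function name below is hypothetical, the toy stand-ins replace real neural networks, and the abstract does not specify the form of the "first loss function", so a plain mean squared error is used purely as a placeholder. The parameter update itself is omitted.

```python
def training_step(item, encoder, hyper_encoder, quantize, decoder,
                  discriminator, loss_fn):
    """Run the stages named in the abstract, in order, and return the loss."""
    latent = encoder(item)                        # latent representation
    entropy_model = hyper_encoder(latent)         # conditional entropy model
    compressed = quantize(latent, entropy_model)  # compressed representation
    reconstruction = decoder(compressed)          # reconstruction of the item
    disc_output = discriminator(reconstruction)   # discriminator network output
    return loss_fn(item, reconstruction, disc_output)


# Toy stand-ins (all hypothetical) so the sketch runs without an ML library.
def toy_encoder(item):
    return [2.0 * x for x in item]


def toy_hyper_encoder(latent):
    return {"mean": sum(latent) / len(latent), "scale": 1.0}


def toy_quantize(latent, entropy_model):
    # A real codec would entropy-code the rounded values under entropy_model;
    # this toy version only rounds and ignores the model.
    return [round(x) for x in latent]


def toy_decoder(compressed):
    return [c / 2.0 for c in compressed]


def toy_discriminator(reconstruction):
    return 0.5  # constant placeholder score


def toy_loss(item, reconstruction, disc_output):
    # Placeholder loss: mean squared error only; disc_output is unused here.
    return sum((a - b) ** 2 for a, b in zip(item, reconstruction)) / len(item)


loss = training_step([0.2, 1.4, 2.6], toy_encoder, toy_hyper_encoder,
                     toy_quantize, toy_decoder, toy_discriminator, toy_loss)
print(loss)
```

In an actual system the returned loss would drive a gradient-based update of the encoder network parameters, which is the final step the abstract names.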
-
Publication No.: US12225239B2
Publication date: 2025-02-11
Application No.: US18238068
Filing date: 2023-08-25
Applicant: Google LLC
Inventor: George Dan Toderici, Fabian Julius Mentzer, Eirikur Thor Agustsson, Michael Tobias Tschannen
IPC: G06V10/00, G06N3/045, G06N3/088, H04N19/124, H04N19/154, H04N19/91
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training an encoder neural network configured to receive a data item and to process the data item to output a compressed representation of the data item. In one aspect, a method includes, for each training data item: processing the data item using the encoder neural network to generate a latent representation of the training data item; processing the latent representation using a hyper-encoder neural network to determine a conditional entropy model; generating a compressed representation of the training data item; processing the compressed representation using a decoder neural network to generate a reconstruction of the training data item; processing the reconstruction of the training data item using a discriminator neural network to generate a discriminator network output; evaluating a first loss function; and determining an update to the current values of the encoder network parameters.
-
Publication No.: US20240169629A1
Publication date: 2024-05-23
Application No.: US18513031
Filing date: 2023-11-17
Applicant: Google LLC
IPC: G06T11/60, G06F40/284, G06V10/74, G06V10/764, G06V10/774, G06V10/776, G06V20/70
CPC classification number: G06T11/60, G06F40/284, G06V10/761, G06V10/764, G06V10/774, G06V10/776, G06V20/70, G06V10/945
Abstract: A first image and textual content associated with the first image is obtained. A second image that depicts the textual content associated with the first image is rendered. The first image and the second image are processed with a machine-learned encoding model to respectively obtain a first image embedding and a second image embedding for an image embedding space including a plurality of image embeddings. The machine-learned encoding model is trained based on a difference between the first image embedding and the second image embedding.
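As a rough illustration of "a difference between the first image embedding and the second image embedding", the sketch below encodes two images with one shared toy encoder and measures how far apart the embeddings are. The abstract does not say which difference measure is used; cosine distance is an assumption, and `encode`, `cosine_distance`, and the weight matrix are hypothetical names chosen for this example.

```python
import math


def encode(image, weights):
    """Toy linear encoder: flatten the image and apply a weight matrix."""
    pixels = [p for row in image for p in row]
    return [sum(w * p for w, p in zip(w_row, pixels)) for w_row in weights]


def cosine_distance(a, b):
    """Return 1 - cosine similarity; 0 means the embeddings point the same way."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


# Hypothetical 2x2 inputs: the first image, and a second image standing in
# for the rendered textual content.
first_image = [[1.0, 0.0], [0.0, 1.0]]
rendered_text_image = [[0.0, 1.0], [1.0, 0.0]]

# One shared weight matrix, because both images pass through the same model.
weights = [[1.0, 0.0, 0.0, 0.0],
           [0.0, 1.0, 0.0, 0.0]]

first_embedding = encode(first_image, weights)
second_embedding = encode(rendered_text_image, weights)
difference = cosine_distance(first_embedding, second_embedding)
print(difference)
```

Training would then adjust the shared encoder according to this difference; the direction and form of that adjustment are not stated in the abstract.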
-
Publication No.: US20240169715A1
Publication date: 2024-05-23
Application No.: US18518075
Filing date: 2023-11-22
Applicant: GOOGLE LLC
Inventor: Lucas Klaus Beyer, Pavel Izmailov, Simon Kornblith, Alexander Kolesnikov, Mathilde Caron, Xiaohua Zhai, Matthias Johannes Lorenz Minderer, Ibrahim Alabdulmohsin, Michael Tobias Tschannen, Filip Pavetic
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training a neural network that is configured to process an input image to generate a network output for the input image. In one aspect, a method comprises, at each of a plurality of training steps: obtaining a plurality of training images for the training step; obtaining, for each of the plurality of training images, a respective target output; and selecting, from a plurality of image patch generation schemes, an image patch generation scheme for the training step, wherein, given an input image, each of the plurality of image patch generation schemes generates a different number of patches of the input image, and wherein each patch comprises a respective subset of the pixels of the input image.
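The idea of several patch generation schemes, each yielding a different number of patches from the same image, can be sketched as follows. This is an assumption-laden toy: the schemes are modeled as different square patch sizes, the per-step selection is uniformly random, and the names `patchify`, `PATCH_SIZES`, and `select_scheme` are hypothetical. The abstract does not state how a scheme is selected or how patches are shaped.

```python
import random


def patchify(image, patch_size):
    """Split a 2-D image (list of rows) into non-overlapping square patches."""
    height, width = len(image), len(image[0])
    if height % patch_size or width % patch_size:
        raise ValueError("image size must be divisible by patch_size")
    patches = []
    for top in range(0, height, patch_size):
        for left in range(0, width, patch_size):
            # Each patch is a subset of the image's pixels.
            patch = [row[left:left + patch_size]
                     for row in image[top:top + patch_size]]
            patches.append(patch)
    return patches


# Hypothetical schemes: each patch size produces a different patch count.
PATCH_SIZES = (2, 4, 8)


def select_scheme(rng):
    """Pick one scheme for the current training step (assumed: uniform random)."""
    return rng.choice(PATCH_SIZES)


rng = random.Random(0)
image = [[r * 8 + c for c in range(8)] for r in range(8)]  # toy 8x8 image
for step in range(3):
    size = select_scheme(rng)
    patches = patchify(image, size)
    print(step, size, len(patches))
```

For the 8x8 toy image, patch sizes 2, 4, and 8 give 16, 4, and 1 patches respectively, so the number of patches the network sees changes with the scheme chosen at each step.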
-
Publication No.: US11750848B2
Publication date: 2023-09-05
Application No.: US17107684
Filing date: 2020-11-30
Applicant: Google LLC
Inventor: George Dan Toderici, Fabian Julius Mentzer, Eirikur Thor Agustsson, Michael Tobias Tschannen
IPC: G06V10/00, H04N19/91, H04N19/124, G06N3/088, H04N19/154, G06N3/045
CPC classification number: H04N19/91, G06N3/045, G06N3/088, H04N19/124, H04N19/154
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training an encoder neural network configured to receive a data item and to process the data item to output a compressed representation of the data item. In one aspect, a method includes, for each training data item: processing the data item using the encoder neural network to generate a latent representation of the training data item; processing the latent representation using a hyper-encoder neural network to determine a conditional entropy model; generating a compressed representation of the training data item; processing the compressed representation using a decoder neural network to generate a reconstruction of the training data item; processing the reconstruction of the training data item using a discriminator neural network to generate a discriminator network output; evaluating a first loss function; and determining an update to the current values of the encoder network parameters.
-
Publication No.: US20220174328A1
Publication date: 2022-06-02
Application No.: US17107684
Filing date: 2020-11-30
Applicant: Google LLC
Inventor: George Dan Toderici, Fabian Julius Mentzer, Eirikur Thor Agustsson, Michael Tobias Tschannen
IPC: H04N19/91, H04N19/124, H04N19/154, G06N3/08, G06N3/04
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training an encoder neural network configured to receive a data item and to process the data item to output a compressed representation of the data item. In one aspect, a method includes, for each training data item: processing the data item using the encoder neural network to generate a latent representation of the training data item; processing the latent representation using a hyper-encoder neural network to determine a conditional entropy model; generating a compressed representation of the training data item; processing the compressed representation using a decoder neural network to generate a reconstruction of the training data item; processing the reconstruction of the training data item using a discriminator neural network to generate a discriminator network output; evaluating a first loss function; and determining an update to the current values of the encoder network parameters.