Patent search ap:("Google LLC") AND inv:"Dirk Weissenborn" Page 2

11.

Invention publication
OPEN-VOCABULARY OBJECT DETECTION IN IMAGES (status: under examination, published)

Publication (announcement) number: US20230360365A1

Publication (announcement) date: 2023-11-09

Application number: US18144045

Application date: 2023-05-05

Applicant: Google LLC

Inventor: Matthias Johannes Lorenz Minderer, Alexey Alexeevich Gritsenko, Austin Charles Stone, Dirk Weissenborn, Alexey Dosovitskiy, Neil Matthew Tinmouth Houlsby

IPC: G06V10/764, G06F40/40, G06V10/82, G06V10/22, G06V10/774, G06V10/776, G06V10/74

CPC classification number: G06V10/764, G06F40/40, G06V10/82, G06V10/225, G06V10/774, G06V10/776, G06V10/761

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for object detection. In one aspect, a method comprises: obtaining: (i) an image, and (ii) a set of one or more query embeddings, wherein each query embedding represents a respective category of object; processing the image and the set of query embeddings using an object detection neural network to generate object detection data for the image, comprising: processing the image using an image encoding subnetwork of the object detection neural network to generate a set of object embeddings; processing each object embedding using a localization subnetwork to generate localization data defining a corresponding region of the image; and processing: (i) the set of object embeddings, and (ii) the set of query embeddings, using a classification subnetwork to generate, for each object embedding, a respective classification score distribution over the set of query embeddings.
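The classification step in this abstract can be illustrated with a minimal sketch. The abstract does not say how the scores are computed, so the dot-product similarity and the softmax normalization below are assumptions, and the names `classify`, `object_embeddings`, and `query_embeddings` are hypothetical.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def classify(object_embeddings, query_embeddings):
    """Return one score distribution over the queries per object embedding.

    Assumption: similarity is a plain dot product and the distribution is a
    softmax over the queries; the abstract does not specify either choice.
    """
    distributions = []
    for obj in object_embeddings:
        logits = [sum(o * q for o, q in zip(obj, query))
                  for query in query_embeddings]
        distributions.append(softmax(logits))
    return distributions


# Two object embeddings scored against two category (query) embeddings.
object_embeddings = [[1.0, 0.0], [0.0, 2.0]]
query_embeddings = [[1.0, 0.0], [0.0, 1.0]]
distributions = classify(object_embeddings, query_embeddings)
```

Because the categories enter only as query embeddings, the set of categories can change between calls without changing the function, which is one way to read "open-vocabulary" here.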

12.

Invention grant
Processing images using self-attention based neural networks (status: in force)

Publication (announcement) number: US12125247B2

Publication (announcement) date: 2024-10-22

Application number: US17492537

Application date: 2021-10-01

Applicant: Google LLC

Inventor: Neil Matthew Tinmouth Houlsby, Sylvain Gelly, Jakob D. Uszkoreit, Xiaohua Zhai, Georg Heigold, Lucas Klaus Beyer, Alexander Kolesnikov, Matthias Johannes Lorenz Minderer, Dirk Weissenborn, Mostafa Dehghani, Alexey Dosovitskiy, Thomas Unterthiner

IPC: G06T7/00, G06F18/24, G06N3/045, G06N3/08

CPC classification number: G06T7/97, G06F18/24, G06N3/045, G06N3/08, G06T2207/20081, G06T2207/20084

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for processing images using self-attention based neural networks. One of the methods includes obtaining one or more images comprising a plurality of pixels; determining, for each image of the one or more images, a plurality of image patches of the image, wherein each image patch comprises a different subset of the pixels of the image; processing, for each image of the one or more images, the corresponding plurality of image patches to generate an input sequence comprising a respective input element at each of a plurality of input positions, wherein a plurality of the input elements correspond to respective different image patches; and processing the input sequences using a neural network to generate a network output that characterizes the one or more images, wherein the neural network comprises one or more self-attention neural network layers.
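The patch-to-sequence step described in this abstract can be sketched as follows. This is only an illustration under assumptions: the patches are taken to be square, non-overlapping, and read in row-major order, and the function name `image_to_patch_sequence` is hypothetical. Any embedding of the patches and the self-attention layers themselves are left out.

```python
def image_to_patch_sequence(image, patch_size):
    """Split a 2D image (list of pixel rows) into flattened patches.

    Assumption: square, non-overlapping patches in row-major order.
    Each patch covers a different subset of the pixels.
    """
    height, width = len(image), len(image[0])
    if height % patch_size or width % patch_size:
        raise ValueError("image dimensions must be divisible by patch_size")
    sequence = []
    for top in range(0, height, patch_size):
        for left in range(0, width, patch_size):
            patch = [image[r][c]
                     for r in range(top, top + patch_size)
                     for c in range(left, left + patch_size)]
            sequence.append(patch)
    return sequence


# A 4x4 single-channel image becomes a sequence of four 2x2 patches.
image = [[r * 4 + c for c in range(4)] for r in range(4)]
sequence = image_to_patch_sequence(image, 2)
```

Each element of `sequence` would then occupy one input position of the network.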

13.

Invention grant
Open-vocabulary object detection in images (status: in force)

Publication (announcement) number: US11928854B2

Publication (announcement) date: 2024-03-12

Application number: US18144045

Application date: 2023-05-05

Applicant: Google LLC

Inventor: Matthias Johannes Lorenz Minderer, Alexey Alexeevich Gritsenko, Austin Charles Stone, Dirk Weissenborn, Alexey Dosovitskiy, Neil Matthew Tinmouth Houlsby

IPC: G06K9/00, G06F40/40, G06V10/22, G06V10/74, G06V10/764, G06V10/774, G06V10/776, G06V10/82

CPC classification number: G06V10/764, G06F40/40, G06V10/225, G06V10/761, G06V10/774, G06V10/776, G06V10/82

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for object detection. In one aspect, a method comprises: obtaining: (i) an image, and (ii) a set of one or more query embeddings, wherein each query embedding represents a respective category of object; processing the image and the set of query embeddings using an object detection neural network to generate object detection data for the image, comprising: processing the image using an image encoding subnetwork of the object detection neural network to generate a set of object embeddings; processing each object embedding using a localization subnetwork to generate localization data defining a corresponding region of the image; and processing: (i) the set of object embeddings, and (ii) the set of query embeddings, using a classification subnetwork to generate, for each object embedding, a respective classification score distribution over the set of query embeddings.

14.

Invention publication
PROCESSING IMAGES USING SELF-ATTENTION BASED NEURAL NETWORKS (status: under examination, published)

Publication (announcement) number: US20240062426A1

Publication (announcement) date: 2024-02-22

Application number: US18500034

Application date: 2023-11-01

Applicant: Google LLC

Inventor: Neil Matthew Tinmouth Houlsby, Sylvain Gelly, Jakob D. Uszkoreit, Xiaohua Zhai, Georg Heigold, Lucas Klaus Beyer, Alexander Kolesnikov, Matthias Johannes Lorenz Minderer, Dirk Weissenborn, Mostafa Dehghani, Alexey Dosovitskiy, Thomas Unterthiner

IPC: G06T7/00, G06F18/24, G06N3/045, G06N3/08

CPC classification number: G06T7/97, G06F18/24, G06N3/045, G06N3/08, G06T2207/20081, G06T2207/20084

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for processing images using self-attention based neural networks. One of the methods includes obtaining one or more images comprising a plurality of pixels; determining, for each image of the one or more images, a plurality of image patches of the image, wherein each image patch comprises a different subset of the pixels of the image; processing, for each image of the one or more images, the corresponding plurality of image patches to generate an input sequence comprising a respective input element at each of a plurality of input positions, wherein a plurality of the input elements correspond to respective different image patches; and processing the input sequences using a neural network to generate a network output that characterizes the one or more images, wherein the neural network comprises one or more self-attention neural network layers.

15.

Invention application
AUTO-REGRESSIVE VIDEO GENERATION NEURAL NETWORKS (status: in force)

Publication (announcement) number: US20220215594A1

Publication (announcement) date: 2022-07-07

Application number: US17609668

Application date: 2020-05-22

Applicant: Google LLC

Inventor: Oscar Carl Tackstrom, Jakob D. Uszkoreit, Dirk Weissenborn

IPC: G06T9/00, G06N3/04

Abstract: A method for generating a video is described. The method includes: generating an initial output video including multiple frames, each of the frames having multiple channels; identifying a partitioning of the initial output video into a set of channel slices that are indexed according to a particular slice order, each channel slice being a down sampling of a channel stack from a set of channel stacks; initializing, for each channel stack in the set of channel stacks, a set of fully-generated channel slices; repeatedly processing, using an encoder and a decoder, a current output video to generate a next fully-generated channel slice to be added to the current set of fully-generated channel slices; generating, for each channel index, a respective fully-generated channel stack using the respective fully generated channel slices; and generating a fully-generated output video using the fully-generated channel stacks.
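The partitioning into channel slices can be pictured with a small sketch. The abstract only says that each slice is a down-sampling of a channel stack; reading that as strided sub-sampling with one slice per offset, and ordering the slices by offset, are assumptions made here for illustration. The name `channel_slices` and the stride values are hypothetical.

```python
from itertools import product


def channel_slices(stack, strides):
    """Partition a T x H x W channel stack into strided sub-sampled slices.

    Assumption: one slice per (time, row, column) offset, indexed in the
    order the offsets are enumerated. Together the slices cover every
    element of the stack exactly once.
    """
    st, sh, sw = strides
    slices = []
    for ot, oh, ow in product(range(st), range(sh), range(sw)):
        slices.append([[row[ow::sw] for row in frame[oh::sh]]
                       for frame in stack[ot::st]])
    return slices


# One channel of a 2-frame, 4x4 video, with each value encoding its position.
stack = [[[t * 16 + h * 4 + w for w in range(4)] for h in range(4)]
         for t in range(2)]
slices = channel_slices(stack, (1, 2, 2))
```

In the method described, the slices would be generated one at a time in slice order, each conditioned on the slices already generated; that generation loop is not shown.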

16.

Invention application
END-TO-END TRAINING OF NEURAL NETWORKS FOR IMAGE PROCESSING (status: in force)

Publication (announcement) number: US20220172066A1

Publication (announcement) date: 2022-06-02

Application number: US17538891

Application date: 2021-11-30

Applicant: Google LLC

Inventor: Thomas Unterthiner, Alexey Dosovitskiy, Aravindh Mahendran, Dirk Weissenborn, Jakob D. Uszkoreit, Jean-Baptiste Cordonnier

IPC: G06N3/08, G06N3/04, G06K9/62, G06K9/00

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training a neural network to process images. One of the methods includes obtaining a training image; processing the training image using a first subnetwork to generate, for each of a plurality of first image patches of the training image, a relevance score; generating, using the relevance scores, one or more second image patches of the training image by performing one or more differentiable operations on the relevance scores; processing the one or more second image patches using a second subnetwork to generate a prediction about the training image; determining an error of the training network output; and generating a parameter update for the first subnetwork, comprising backpropagating gradients determined according to the error of the training network output through i) the second subnetwork, ii) the one or more differentiable operations, and iii) the first subnetwork.

17.

Invention application
Conditional Axial Transformer Layers for High-Fidelity Image Transformation (status: in force)

Publication (announcement) number: US20220108423A1

Publication (announcement) date: 2022-04-07

Application number: US17449162

Application date: 2021-09-28

Applicant: Google LLC

Inventor: Manoj Kumar Sivaraj, Dirk Weissenborn, Nal Emmerich Kalchbrenner

IPC: G06T3/40, G06N3/08

Abstract: Apparatus and methods relate to receiving an input image comprising an array of pixels, wherein the input image is associated with a first characteristic; applying a neural network to transform the input image to an output image associated with a second characteristic by generating, by an encoder and for each pixel of the array of pixels of the input image, an encoded pixel, providing, to a decoder, the array of encoded pixels, applying, by the decoder, axial attention to decode a given pixel, wherein the axial attention comprises a row attention or a column attention applied to one or more previously decoded pixels in rows or columns preceding a row or column associated with the given pixel, wherein the row or column attention mixes information within a respective row or column, and maintains independence between respective different rows or different columns; and generating, by the neural network, the output image.
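The property that row or column attention mixes information within a row or column while keeping different rows or columns independent can be sketched as below. This is a simplified illustration, not the claimed decoder: it assumes identity query, key, and value projections and scaled dot-product weights, and it omits the restriction to previously decoded pixels. The function names are hypothetical.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def attend(vectors):
    """Self-attention over a list of vectors (identity projections assumed)."""
    dim = len(vectors[0])
    out = []
    for q in vectors:
        logits = [sum(a * b for a, b in zip(q, k)) / math.sqrt(dim)
                  for k in vectors]
        weights = softmax(logits)
        out.append([sum(w * v[d] for w, v in zip(weights, vectors))
                    for d in range(dim)])
    return out


def row_attention(grid):
    """Attend within each row of an H x W grid of vectors, rows independent."""
    return [attend(row) for row in grid]


def column_attention(grid):
    """Attend within each column by transposing, attending, transposing back."""
    columns = [list(col) for col in zip(*grid)]
    attended = [attend(col) for col in columns]
    return [list(row) for row in zip(*attended)]


# A 2 x 2 grid of 2-dimensional pixel encodings.
grid = [[[1.0, 0.0], [0.0, 1.0]],
        [[2.0, 0.0], [0.0, 2.0]]]
out = row_attention(grid)
```

Changing the second row of `grid` leaves `out[0]` unchanged, which is the independence between rows that the abstract refers to.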

18.

Invention application
Object-Centric Learning with Slot Attention (status: in force)

Publication (announcement) number: US20210383199A1

Publication (announcement) date: 2021-12-09

Application number: US16927018

Application date: 2020-07-13

Applicant: Google LLC

Inventor: Dirk Weissenborn, Jakob Uszkoreit, Thomas Unterthiner, Aravindh Mahendran, Francesco Locatello, Thomas Kipf, Georg Heigold, Alexey Dosovitskiy

IPC: G06N3/063, G06N3/08, G06F17/16

Abstract: A method involves receiving a perceptual representation including a plurality of feature vectors, and initializing a plurality of slot vectors represented by a neural network memory unit. Each respective slot vector is configured to represent a corresponding entity in the perceptual representation. The method also involves determining an attention matrix based on a product of the plurality of feature vectors transformed by a key function and the plurality of slot vectors transformed by a query function. Each respective value of a plurality of values along each respective dimension of the attention matrix is normalized with respect to the plurality of values. The method additionally involves determining an update matrix based on the plurality of feature vectors transformed by a value function and the attention matrix, and updating the plurality of slot vectors based on the update matrix by way of the neural network memory unit.
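One iteration of the attention and update computation in this abstract might look like the following sketch. Several simplifications are assumptions rather than details from the abstract: the key, query, and value functions are taken to be the identity, the attention matrix is normalized over the slots with a softmax, the update for each slot is a weighted mean of the feature vectors, and the neural network memory unit is replaced by simply returning the update as the new slot value. The name `slot_attention_step` is hypothetical.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def slot_attention_step(features, slots):
    """Return (attention, updates) for one simplified iteration.

    attention[i][j] is the weight of slot j for feature i, normalized over
    the slots so that slots compete for each feature (assumption).
    """
    dim = len(slots[0])
    attention = [softmax([dot(f, s) / math.sqrt(dim) for s in slots])
                 for f in features]
    updates = []
    for j in range(len(slots)):
        total = sum(row[j] for row in attention)
        # Weighted mean of the features assigned to slot j (assumption).
        updates.append([sum(row[j] * f[d] for row, f in zip(attention, features)) / total
                        for d in range(dim)])
    return attention, updates


# Three feature vectors, two slots; the memory-unit update is not modeled.
features = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
slots = [[1.0, 0.0], [0.0, 1.0]]
attention, new_slots = slot_attention_step(features, slots)
```

Here the first slot moves toward the two features near `[1, 0]` and the second toward `[0, 1]`, which illustrates each slot coming to represent a different entity.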
