-
Publication No.: US20220215654A1
Publication Date: 2022-07-07
Application No.: US17606976
Filing Date: 2020-05-22
Applicant: Google LLC
Inventor: Jonathon Shlens, Ashish Teku Vaswani, Niki J. Parmar, Prajit Ramachandran, Anselm Caelifer Levskaya, Irwan Bello
Abstract: A system is described that is implemented as computer programs on one or more computers in one or more locations and that implements a computer vision model. The computer vision model includes a positional local self-attention layer that is configured to receive an input feature map and to generate an output feature map. For each input element in the input feature map, the positional local self-attention layer generates a respective output element for the output feature map by: generating a memory block that includes the neighboring input elements around the input element; generating a query vector from the input element and a query weight matrix; performing, for each neighboring element in the memory block, positional local self-attention operations to generate a temporary output element; and summing the temporary output elements of the neighboring elements in the memory block to produce the respective output element.
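The per-element steps in the abstract above (memory block, query vector, per-neighbor temporary output, sum) can be sketched in plain Python. This is an illustrative sketch only, not the claimed implementation: the abstract does not say how the attention scores are formed, so the key and value matrices (`Wk`, `Wv`), the relative-position table `rel_emb`, the softmax normalization, and the border clipping are all assumptions, and the function names are hypothetical.

```python
import math

def matvec(W, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def positional_local_self_attention(fmap, Wq, Wk, Wv, rel_emb, radius=1):
    """fmap is an H x W grid of feature vectors.

    rel_emb maps a relative offset (da, db) to a vector with the same
    dimension as the query (assumed form of the positional term).
    """
    H, W = len(fmap), len(fmap[0])
    out = [[None] * W for _ in range(H)]
    for i in range(H):
        for j in range(W):
            # Query vector from the input element and the query weight matrix.
            q = matvec(Wq, fmap[i][j])
            # Memory block: neighbors within `radius`, clipped at the border
            # (assumption; padding would be an equally valid choice).
            block = [(a, b)
                     for a in range(max(0, i - radius), min(H, i + radius + 1))
                     for b in range(max(0, j - radius), min(W, j + radius + 1))]
            logits, values = [], []
            for a, b in block:
                k = matvec(Wk, fmap[a][b])
                # Content score plus relative-position score (assumed form).
                logits.append(dot(q, k) + dot(q, rel_emb[(a - i, b - j)]))
                values.append(matvec(Wv, fmap[a][b]))
            # Softmax over the memory block, shifted for numerical stability.
            m = max(logits)
            exps = [math.exp(l - m) for l in logits]
            z = sum(exps)
            # Each neighbor contributes a temporary output (weight * value);
            # the output element is the sum of those temporary outputs.
            y = [0.0] * len(values[0])
            for e, v in zip(exps, values):
                for c in range(len(y)):
                    y[c] += (e / z) * v[c]
            out[i][j] = y
    return out
```

With all-zero keys and position vectors, every neighbor receives the same attention weight, so the layer reduces to a local average of the value vectors. That makes a convenient sanity check.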
-
Publication No.: US20210248472A1
Publication Date: 2021-08-12
Application No.: US17121161
Filing Date: 2020-12-14
Applicant: Google LLC
Inventor: Gamaleldin Elsayed, Prajit Ramachandran, Jon Shlens, Simon Kornblith
Abstract: The present disclosure provides a neural network including one or more layers with relaxed spatial invariance. Each of the one or more layers can be configured to receive a respective layer input. Each of the one or more layers can be configured to convolve a plurality of different kernels against the respective layer input to generate a plurality of intermediate outputs, each of the plurality of intermediate outputs having a plurality of portions. Each of the one or more layers can be configured to apply, for each of the plurality of intermediate outputs, a respective plurality of weights respectively associated with the plurality of portions to generate a respective weighted output. Each of the one or more layers can be configured to generate a respective layer output based on the weighted outputs.
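The layer described in this abstract can be illustrated with a one-dimensional sketch. It rests on assumptions the abstract does not confirm: each "portion" is taken to be a single output position, the weighted outputs are combined by summation, and all kernels share one length. The names `conv1d_valid` and `relaxed_invariance_layer` are hypothetical.

```python
def conv1d_valid(x, kernel):
    """Slide `kernel` over `x` without padding (cross-correlation, as is
    conventional for neural-network 'convolutions')."""
    n = len(x) - len(kernel) + 1
    return [sum(kernel[t] * x[p + t] for t in range(len(kernel)))
            for p in range(n)]

def relaxed_invariance_layer(x, kernels, position_weights):
    """position_weights[k][p] scales the output of kernel k at position p."""
    # One intermediate output per kernel.
    intermediates = [conv1d_valid(x, k) for k in kernels]
    # Apply the per-position weights to each intermediate output.
    weighted = [[w * v for w, v in zip(ws, inter)]
                for ws, inter in zip(position_weights, intermediates)]
    # Combine the weighted outputs position by position (summation assumed).
    return [sum(column) for column in zip(*weighted)]

x = [1, 2, 3, 4]
kernels = [[1, 0], [0, 1]]          # pick the left or the right neighbor
weights = [[1, 0, 1], [0, 1, 0]]    # which kernel is active at each position
print(relaxed_invariance_layer(x, kernels, weights))  # [1, 3, 3]
```

If every kernel had the same weight at every position, the layer would reduce to an ordinary convolution with one combined kernel. Letting the weights vary by position is what relaxes the spatial invariance in this sketch.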
-
Publication No.: US12265911B2
Publication Date: 2025-04-01
Application No.: US17121161
Filing Date: 2020-12-14
Applicant: Google LLC
Inventor: Gamaleldin Elsayed, Prajit Ramachandran, Jon Shlens, Simon Kornblith
Abstract: A computing system can include one or more non-transitory computer-readable media that collectively store a neural network including one or more layers with relaxed spatial invariance. Each of the one or more layers can be configured to receive a respective layer input. Each of the one or more layers can be configured to convolve a plurality of different kernels against the respective layer input to generate a plurality of intermediate outputs, each of the plurality of intermediate outputs having a plurality of portions. Each of the one or more layers can be configured to apply, for each of the plurality of intermediate outputs, a respective plurality of weights respectively associated with the plurality of portions to generate a respective weighted output. Each of the one or more layers can be configured to generate a respective layer output based on the weighted outputs.
-
Publication No.: US20210390410A1
Publication Date: 2021-12-16
Application No.: US17347416
Filing Date: 2021-06-14
Applicant: Google LLC
Inventor: Ashish Teku Vaswani, Prajit Ramachandran, Aravind Srinivas Lakshminarayanan, Blake Alan Hechtman, Niki J. Parmar
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for processing images using a computer vision neural network that has one or more local self-attention layers. Each local self-attention layer is configured to apply one or more local self-attention mechanisms to the layer input to the local self-attention layer.
-