General padding support for convolution on systolic arrays

    Publication No.: US11449739B2

    Publication date: 2022-09-20

    Application No.: US16548555

    Filing date: 2019-08-22

    Applicant: Google LLC

    Abstract: Methods and systems, including computer programs encoded on a computer storage medium. In one aspect, a method includes the actions of receiving a request to perform convolutional computations for a neural network on a hardware circuit having a matrix computation unit, the request specifying the convolutional computation to be performed on a feature tensor and a filter and padding applied to the feature tensor prior to performing the convolutional computation; and generating instructions that when executed by the hardware circuit cause the hardware circuit to perform operations comprising: transferring feature tensor data from a main memory of the hardware circuit to a scratchpad memory of the hardware circuit; and repeatedly performing the following operations: identifying a current subset of the feature tensor; and determining whether a memory view into the scratchpad memory for the current subset is consistent with a memory view of the current subset in the main memory.
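The abstract does not say how padding is applied or how the two memory views are compared, so the following is only a minimal sketch of one possible reading. It assumes a 2D feature map, zero padding of equal width on every side, and treats a subset as "consistent" when it lies entirely inside the original data. The names `padded_tile` and `conv2d_with_padding` are hypothetical and do not come from the patent.

```python
def padded_tile(feature, row0, col0, tile_h, tile_w, pad):
    """Read one tile of the zero-padded feature map.

    (row0, col0) are coordinates in the padded map. Returns (tile, consistent),
    where consistent is True only if no padding element was needed, i.e. the
    tile could be read with the same layout as the unpadded source.
    This consistency rule is an assumption made for illustration.
    """
    h, w = len(feature), len(feature[0])
    tile, consistent = [], True
    for r in range(row0, row0 + tile_h):
        row = []
        for c in range(col0, col0 + tile_w):
            src_r, src_c = r - pad, c - pad  # map back to unpadded coordinates
            if 0 <= src_r < h and 0 <= src_c < w:
                row.append(feature[src_r][src_c])
            else:
                row.append(0)  # padding element
                consistent = False
        tile.append(row)
    return tile, consistent


def conv2d_with_padding(feature, kernel, pad):
    """Stride-1 2D convolution (cross-correlation) over the zero-padded map."""
    h, w = len(feature), len(feature[0])
    kh, kw = len(kernel), len(kernel[0])
    out = []
    for i in range(h + 2 * pad - kh + 1):
        out_row = []
        for j in range(w + 2 * pad - kw + 1):
            tile, _ = padded_tile(feature, i, j, kh, kw, pad)
            out_row.append(sum(tile[a][b] * kernel[a][b]
                               for a in range(kh) for b in range(kw)))
        out.append(out_row)
    return out
```

A tile that touches the border reports `consistent == False`, which is the case where a real implementation would have to handle padding separately instead of reusing the source layout.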

    CROSS REPLICA REDUCTION ON NETWORKS HAVING DEGRADED NODES

    Publication No.: US20210049408A1

    Publication date: 2021-02-18

    Application No.: US16543410

    Filing date: 2019-08-16

    Applicant: Google LLC

    Abstract: Methods, systems, and apparatus, including instructions encoded on storage media, for performing reduction of gradient vectors for a network having one or more degraded nodes. A method comprises training a respective replica of a machine learning model on each node of multiple nodes organized in an n-dimensional network topology, combining the respective individual gradient vectors in the nodes to generate a final gradient vector by performing operations comprising: designating each group of nodes along the dimension as either a forwarding group or a critical group, updating, for each receiving node, a respective individual gradient vector with an intermediate gradient vector, performing a reduction on each critical group of nodes along the dimension to generate a respective partial final gradient vector for the critical group, and updating, for each critical group of nodes, an individual gradient vector for a representative node with the respective partial final gradient vector.
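The steps named in the abstract can be illustrated with a small simulation. This is a sketch under several assumptions that the abstract does not confirm: the topology is a 2D grid, groups are rows, a row containing a degraded node is a forwarding group, degraded nodes (marked `None`) contribute no gradient, the receiving node is the same-column node in the nearest critical row, and the representative node of a critical row is its first column. The function name `cross_replica_sum` is hypothetical.

```python
def add(u, v):
    """Element-wise sum of two gradient vectors."""
    return [a + b for a, b in zip(u, v)]


def cross_replica_sum(grads):
    """grads[r][c] is a gradient vector, or None for a degraded node."""
    rows = len(grads)
    # Assumption: any row with a degraded node is a forwarding group.
    forwarding = [any(g is None for g in row) for row in grads]
    critical_rows = [r for r in range(rows) if not forwarding[r]]
    if not critical_rows:
        raise ValueError("need at least one row without degraded nodes")

    # Work on a copy so the caller's data is not modified.
    work = [[None if g is None else list(g) for g in row] for row in grads]

    # Forwarding groups send each healthy node's gradient to a receiving node
    # in the nearest critical row, which adds it to its own gradient.
    for r in range(rows):
        if not forwarding[r]:
            continue
        target = min(critical_rows, key=lambda c: abs(c - r))
        for col, g in enumerate(work[r]):
            if g is not None:
                work[target][col] = add(work[target][col], g)

    # Reduce each critical group to a partial final gradient vector, held
    # here as if stored on the representative node (column 0).
    partials = []
    for r in critical_rows:
        total = work[r][0]
        for g in work[r][1:]:
            total = add(total, g)
        partials.append(total)

    # Combine the partial results across representative nodes.
    final = partials[0]
    for p in partials[1:]:
        final = add(final, p)
    return final
```

In this sketch the final vector equals the sum of the gradients of all healthy nodes, which is the property a reduction of this kind needs to preserve when some nodes are unavailable.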

    GENERAL PADDING SUPPORT FOR CONVOLUTION ON SYSTOLIC ARRAYS

    Publication No.: US20220414441A1

    Publication date: 2022-12-29

    Application No.: US17902776

    Filing date: 2022-09-02

    Applicant: Google LLC

    Abstract: Methods and systems, including computer programs encoded on a computer storage medium. In one aspect, a method includes the actions of receiving a request to perform convolutional computations for a neural network on a hardware circuit having a matrix computation unit, the request specifying the convolutional computation to be performed on a feature tensor and a filter and padding applied to the feature tensor prior to performing the convolutional computation; and generating instructions that when executed by the hardware circuit cause the hardware circuit to perform operations comprising: transferring feature tensor data from a main memory of the hardware circuit to a scratchpad memory of the hardware circuit; and repeatedly performing the following operations: identifying a current subset of the feature tensor; and determining whether a memory view into the scratchpad memory for the current subset is consistent with a memory view of the current subset in the main memory.

    GENERAL PADDING SUPPORT FOR CONVOLUTION ON SYSTOLIC ARRAYS

    Publication No.: US20210056396A1

    Publication date: 2021-02-25

    Application No.: US16548555

    Filing date: 2019-08-22

    Applicant: Google LLC

    Abstract: Methods and systems, including computer programs encoded on a computer storage medium. In one aspect, a method includes the actions of receiving a request to perform convolutional computations for a neural network on a hardware circuit having a matrix computation unit, the request specifying the convolutional computation to be performed on a feature tensor and a filter and padding applied to the feature tensor prior to performing the convolutional computation; and generating instructions that when executed by the hardware circuit cause the hardware circuit to perform operations comprising: transferring feature tensor data from a main memory of the hardware circuit to a scratchpad memory of the hardware circuit; and repeatedly performing the following operations: identifying a current subset of the feature tensor; and determining whether a memory view into the scratchpad memory for the current subset is consistent with a memory view of the current subset in the main memory.

    Cross replica reduction on networks having degraded nodes

    Publication No.: US11715010B2

    Publication date: 2023-08-01

    Application No.: US16543410

    Filing date: 2019-08-16

    Applicant: Google LLC

    Abstract: Methods, systems, and apparatus, including instructions encoded on storage media, for performing reduction of gradient vectors for a network having one or more degraded nodes. A method comprises training a respective replica of a machine learning model on each node of multiple nodes organized in an n-dimensional network topology, combining the respective individual gradient vectors in the nodes to generate a final gradient vector by performing operations comprising: designating each group of nodes along the dimension as either a forwarding group or a critical group, updating, for each receiving node, a respective individual gradient vector with an intermediate gradient vector, performing a reduction on each critical group of nodes along the dimension to generate a respective partial final gradient vector for the critical group, and updating, for each critical group of nodes, an individual gradient vector for a representative node with the respective partial final gradient vector.
