-
公开(公告)号:US20240061742A1
公开(公告)日:2024-02-22
申请号:US18386641
申请日:2023-11-03
Applicant: Google LLC
Inventor: Doe Hyun Yoon , Norman Paul Jouppi
CPC classification number: G06F11/1004 , G06F15/8046
Abstract: Aspects of the disclosure are directed to a computation unit implementing a systolic array and configured for detecting errors while processing data on the systolic array. Checksum circuit in communication with a systolic array is configured to compute checksums and perform error detection while the systolic array processes input data. Instead of pre-generating checksums in input matrices, input matrices can be directly fed into the systolic array through the checksum circuit. On the output side, the checksum circuit can generate and compare checksums with checksums in an output matrix generated by the systolic array. Error checking the operations to generate the output matrix can be performed without delaying the operations of the systolic array, and without preprocessing the input matrices.
-
公开(公告)号:US11853156B2
公开(公告)日:2023-12-26
申请号:US17961623
申请日:2022-10-07
Applicant: Google LLC
Inventor: Doe Hyun Yoon , Norman Paul Jouppi
CPC classification number: G06F11/1004 , G06F15/8046
Abstract: Aspects of the disclosure are directed to a computation unit implementing a systolic array and configured for detecting errors while processing data on the systolic array. Checksum circuit in communication with a systolic array is configured to compute checksums and perform error detection while the systolic array processes input data. Instead of pre-generating checksums in input matrices, input matrices can be directly fed into the systolic array through the checksum circuit. On the output side, the checksum circuit can generate and compare checksums with checksums in an output matrix generated by the systolic array. Error checking the operations to generate the output matrix can be performed without delaying the operations of the systolic array, and without preprocessing the input matrices.
-
公开(公告)号:US12189472B2
公开(公告)日:2025-01-07
申请号:US18386641
申请日:2023-11-03
Applicant: Google LLC
Inventor: Doe Hyun Yoon , Norman Paul Jouppi
Abstract: Aspects of the disclosure are directed to a computation unit implementing a systolic array and configured for detecting errors while processing data on the systolic array. Checksum circuit in communication with a systolic array is configured to compute checksums and perform error detection while the systolic array processes input data. Instead of pre-generating checksums in input matrices, input matrices can be directly fed into the systolic array through the checksum circuit. On the output side, the checksum circuit can generate and compare checksums with checksums in an output matrix generated by the systolic array. Error checking the operations to generate the output matrix can be performed without delaying the operations of the systolic array, and without preprocessing the input matrices.
-
公开(公告)号:US20230036421A1
公开(公告)日:2023-02-02
申请号:US17961623
申请日:2022-10-07
Applicant: Google LLC
Inventor: Doe Hyun Yoon , Norman Paul Jouppi
Abstract: Aspects of the disclosure are directed to a computation unit implementing a systolic array and configured for detecting errors while processing data on the systolic array. Checksum circuit in communication with a systolic array is configured to compute checksums and perform error detection while the systolic array processes input data. Instead of pre-generating checksums in input matrices, input matrices can be directly fed into the systolic array through the checksum circuit. On the output side, the checksum circuit can generate and compare checksums with checksums in an output matrix generated by the systolic array. Error checking the operations to generate the output matrix can be performed without delaying the operations of the systolic array, and without preprocessing the input matrices.
-
公开(公告)号:US11915139B2
公开(公告)日:2024-02-27
申请号:US17672163
申请日:2022-02-15
Applicant: Google LLC
Inventor: Doe Hyun Yoon , Nishant Patil , Norman Paul Jouppi
Abstract: Methods, systems, and apparatus for updating machine learning models to improve locality are described. In one aspect, a method includes receiving data of a machine learning model. The data represents operations of the machine learning model and data dependencies between the operations. Data specifying characteristics of a memory hierarchy for a machine learning processor on which the machine learning model is going to be deployed is received. The memory hierarchy includes multiple memories at multiple memory levels for storing machine learning data used by the machine learning processor when performing machine learning computations using the machine learning model. An updated machine learning model is generated by modifying the operations and control dependencies of the machine learning model to account for the characteristics of the memory hierarchy. Machine learning computations are performed using the updated machine learning model.
-
公开(公告)号:US20220172060A1
公开(公告)日:2022-06-02
申请号:US17672163
申请日:2022-02-15
Applicant: Google LLC
Inventor: Doe Hyun Yoon , Nishant Patil , Norman Paul Jouppi
Abstract: Methods, systems, and apparatus for updating machine learning models to improve locality are described. In one aspect, a method includes receiving data of a machine learning model. The data represents operations of the machine learning model and data dependencies between the operations. Data specifying characteristics of a memory hierarchy for a machine learning processor on which the machine learning model is going to be deployed is received. The memory hierarchy includes multiple memories at multiple memory levels for storing machine learning data used by the machine learning processor when performing machine learning computations using the machine learning model. An updated machine learning model is generated by modifying the operations and control dependencies of the machine learning model to account for the characteristics of the memory hierarchy. Machine learning computations are performed using the updated machine learning model.
-
公开(公告)号:US11507452B1
公开(公告)日:2022-11-22
申请号:US17410558
申请日:2021-08-24
Applicant: Google LLC
Inventor: Doe Hyun Yoon , Norman Paul Jouppi
Abstract: Aspects of the disclosure are directed to a computation unit implementing a systolic array and configured for detecting errors while processing data on the systolic array. Checksum circuit in communication with a systolic array is configured to compute checksums and perform error detection while the systolic array processes input data. Instead of pre-generating checksums in input matrices, input matrices can be directly fed into the systolic array through the checksum circuit. On the output side, the checksum circuit can generate and compare checksums with checksums in an output matrix generated by the systolic array. Error checking the operations to generate the output matrix can be performed without delaying the operations of the systolic array, and without preprocessing the input matrices.
-
公开(公告)号:US12197890B2
公开(公告)日:2025-01-14
申请号:US17377743
申请日:2021-07-16
Applicant: Google LLC
Inventor: Doe Hyun Yoon , Lifeng Nai
Abstract: The subject matter described herein provides systems and techniques for the design and use of multiply-and-accumulate (MAC) units to perform matrix multiplication by systolic arrays, such as those used in accelerators for deep neural networks (DNNs). These MAC units may take advantage of the particular way in which matrix multiplication is performed within a systolic array. For example, when a matrix A is multiplied with a matrix B, the scalar value, a, of the matrix A is reused many times, the scalar value, b, of the matrix B may be streamed into the systolic array and forwarded to a series of MAC units in the systolic array, and only the final values and not the intermediate values of the dot products, computed for the matrix multiplication, may be correct. MAC unit hardware that is particularized to take advantage of these observations is described herein.
-
公开(公告)号:US20230015148A1
公开(公告)日:2023-01-19
申请号:US17377743
申请日:2021-07-16
Applicant: Google LLC
Inventor: Doe Hyun Yoon , Lifeng Nai
Abstract: The subject matter described herein provides systems and techniques for the design and use of multiply-and-accumulate (MAC) units to perform matrix multiplication by systolic arrays, such as those used in accelerators for deep neural networks (DNNs). These MAC units may take advantage of the particular way in which matrix multiplication is performed within a systolic array. For example, when a matrix A is multiplied with a matrix B, the scalar value, a, of the matrix A is reused many times, the scalar value, b, of the matrix B may be streamed into the systolic array and forwarded to a series of MAC units in the systolic array, and only the final values and not the intermediate values of the dot products, computed for the matrix multiplication, may be correct. MAC unit hardware that is particularized to take advantage of these observations is described herein.
-
公开(公告)号:US11263529B2
公开(公告)日:2022-03-01
申请号:US16156573
申请日:2018-10-10
Applicant: Google LLC
Inventor: Doe Hyun Yoon , Nishant Patil , Norman Paul Jouppi
Abstract: Methods, systems, and apparatus for updating machine learning models to improve locality are described. In one aspect, a method includes receiving data of a machine learning model. The data represents operations of the machine learning model and data dependencies between the operations. Data specifying characteristics of a memory hierarchy for a machine learning processor on which the machine learning model is going to be deployed is received. The memory hierarchy includes multiple memories at multiple memory levels for storing machine learning data used by the machine learning processor when performing machine learning computations using the machine learning model. An updated machine learning model is generated by modifying the operations and control dependencies of the machine learning model to account for the characteristics of the memory hierarchy. Machine learning computations are performed using the updated machine learning model.
-
-
-
-
-
-
-
-
-