-
Publication Number: US12131130B2
Publication Date: 2024-10-29
Application Number: US18105159
Application Date: 2023-02-02
Applicant: QUALCOMM Incorporated
Inventor: Rexford Alan Hill, Aaron Douglass Lamb, Michael Goldfarb, Amin Ansari, Christopher Lott
CPC classification number: G06F7/5443, G06F5/06, G06N3/063
Abstract: A method of exploiting activation sparsity in deep neural networks is described. The method includes retrieving an activation tensor and a weight tensor where the activation tensor is a sparse activation tensor. The method also includes generating a compressed activation tensor comprising non-zero activations of the activation tensor, where the compressed activation tensor has fewer columns than the activation tensor. The method further includes processing the compressed activation tensor and the weight tensor to generate an output tensor.
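The compression step the abstract describes can be illustrated with a small sketch: drop the activation columns that are entirely zero, keep only the matching weight rows, and multiply the smaller operands. This is a minimal NumPy illustration of the general idea, not the patented hardware mechanism; the function name and tensor layout (rows = batch, columns = activations) are assumptions for the example.

```python
import numpy as np

def sparse_matmul(activations, weights):
    """Exploit activation sparsity: instead of the dense product
    activations @ weights, keep only the non-zero activation columns
    and the weight rows they pair with."""
    # Columns of the activation tensor containing at least one non-zero.
    nz_cols = np.flatnonzero(np.any(activations != 0, axis=0))
    # Compressed activation tensor: fewer columns than the original.
    compressed = activations[:, nz_cols]
    # Only the surviving weight rows contribute to the output tensor.
    return compressed @ weights[nz_cols, :]
```

Because zero activations contribute nothing to the product, the compressed computation yields the same output tensor while skipping the multiply-accumulate work for the zero columns.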
-
Publication Number: US11586417B2
Publication Date: 2023-02-21
Application Number: US16147297
Application Date: 2018-09-28
Applicant: QUALCOMM Incorporated
Inventor: Rexford Hill, Aaron Lamb, Michael Goldfarb, Amin Ansari, Christopher Lott
Abstract: A method of exploiting activation sparsity in deep neural networks is described. The method includes retrieving an activation tensor and a weight tensor where the activation tensor is a sparse activation tensor. The method also includes generating a compressed activation tensor comprising non-zero activations of the activation tensor, where the compressed activation tensor has fewer columns than the activation tensor. The method further includes processing the compressed activation tensor and the weight tensor to generate an output tensor.
-
Publication Number: US11763141B2
Publication Date: 2023-09-19
Application Number: US17713176
Application Date: 2022-04-04
Applicant: Qualcomm Incorporated
Inventor: Jinxia Bai, Rosario Cammarota, Michael Goldfarb
CPC classification number: G06N3/063, G06F13/28, G06F15/7825
Abstract: A neural processing unit (NPU) is described. The NPU includes an NPU direct memory access (NDMA) core. The NDMA core includes a read engine having a read buffer. The NDMA core also includes a write engine having a write buffer. The NPU also includes a controller. The controller is configured to direct the NDMA core to perform hardware memory bandwidth optimization for reading/writing NDMA data in the read buffer and/or NDMA data in the write buffer. The NDMA core is also configured to transparently combine NDMA transaction requests for a data stripe to increase local access to available tensors in artificial neural networks.
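The transaction-combining behavior the abstract mentions can be sketched in a few lines: adjacent NDMA requests within one data stripe are merged into a single larger transfer, reducing per-transaction overhead on the memory bus. The `(address, length)` tuple representation is a hypothetical stand-in for the hardware request format, purely for illustration.

```python
def combine_requests(requests):
    """Merge address-adjacent DMA transaction requests into larger
    transfers.  Each request is an (address, length) pair; requests
    whose ranges abut are combined transparently."""
    merged = []
    for addr, length in sorted(requests):
        if merged and merged[-1][0] + merged[-1][1] == addr:
            # This request starts exactly where the previous one ends:
            # extend the previous transfer instead of issuing a new one.
            merged[-1] = (merged[-1][0], merged[-1][1] + length)
        else:
            merged.append((addr, length))
    return merged
```

Fewer, larger transactions make better use of available memory bandwidth, which is the optimization the controller directs the NDMA core to perform.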
-
Publication Number: US11769036B2
Publication Date: 2023-09-26
Application Number: US15956674
Application Date: 2018-04-18
Applicant: QUALCOMM Incorporated
Inventor: Rosario Cammarota, Michael Goldfarb, Manu Rastogi, Sarang Ozarde
Abstract: An apparatus for optimizing a computational network is configured to receive an input at a first processing component. The first processing component may include at least a first programmable processing component and a second programmable processing component. The first programmable processing component is configured to compute a first nonlinear function and the second programmable processing component is configured to compute a second nonlinear function which is different from the first nonlinear function. The computational network, which may be a recurrent neural network such as a long short-term memory, may be operated to generate an inference based at least in part on outputs of the first programmable processing component and the second programmable processing component.
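A long short-term memory cell is a natural example of a network built from two different nonlinear functions: its gates use a sigmoid while its candidate and output paths use tanh. The sketch below shows one LSTM step structured around those two nonlinearities; the choice of sigmoid/tanh and the parameter shapes are standard-LSTM assumptions for illustration, not details from the patent.

```python
import numpy as np

def sigmoid(x):
    """First nonlinear function (gating)."""
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h, c, W, U, b):
    """One LSTM step using two different nonlinear functions, mirroring
    the two programmable processing components in the abstract.
    W: (4n, d) input weights, U: (4n, n) recurrent weights, b: (4n,)."""
    n = h.shape[0]
    z = W @ x + U @ h + b          # all four gate pre-activations at once
    i = sigmoid(z[0:n])            # input gate      (first nonlinearity)
    f = sigmoid(z[n:2*n])          # forget gate     (first nonlinearity)
    o = sigmoid(z[2*n:3*n])        # output gate     (first nonlinearity)
    g = np.tanh(z[3*n:4*n])        # candidate state (second nonlinearity)
    c_new = f * c + i * g
    h_new = o * np.tanh(c_new)
    return h_new, c_new
```

The inference output depends on both nonlinearities at every step, which is why dedicating a programmable component to each can pay off.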
-
Publication Number: US11861484B2
Publication Date: 2024-01-02
Application Number: US16147189
Application Date: 2018-09-28
Applicant: QUALCOMM Incorporated
Inventor: Jinxia Bai, Rosario Cammarota, Michael Goldfarb
CPC classification number: G06N3/063, G06F9/30098, G06F15/7825
Abstract: A neural processing unit (NPU) is described. The NPU includes an NPU direct memory access (NDMA) core. The NDMA core includes a read engine having a read buffer. The NDMA core also includes a write engine having a write buffer. The NPU also includes a controller. The controller is configured to direct the NDMA core to perform hardware pre-processing of NDMA data in the read buffer and post-processing of NDMA data in the write buffer on blocks of a data stripe to process tensors in artificial neural networks.
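The read/write engine flow the abstract describes, blocks of a data stripe pre-processed on the way into the read buffer and post-processed on the way out of the write buffer, can be sketched as a simple pipeline. The `pre` and `post` callables stand in for the hardware pre-/post-processing stages, and the flat-list stripe representation is an assumption for the example.

```python
def process_stripe(stripe, block_size, pre, post):
    """Walk a data stripe block by block: the read engine fetches a
    block, hardware pre-processing transforms it in the read buffer,
    and post-processing runs on the result in the write buffer."""
    out = []
    for start in range(0, len(stripe), block_size):
        block = stripe[start:start + block_size]  # read engine fetches a block
        block = pre(block)                        # pre-processing in read buffer
        # ... NPU compute on the block would happen here ...
        out.extend(post(block))                   # post-processing in write buffer
    return out
```

Performing these transformations in hardware, per block, keeps the NPU's compute units fed without a separate software pass over each tensor.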
-
Publication Number: US11295205B2
Publication Date: 2022-04-05
Application Number: US16147245
Application Date: 2018-09-28
Applicant: QUALCOMM Incorporated
Inventor: Jinxia Bai, Rosario Cammarota, Michael Goldfarb
Abstract: A neural processing unit (NPU) is described. The NPU includes an NPU direct memory access (NDMA) core. The NDMA core includes a read engine having a read buffer. The NDMA core also includes a write engine having a write buffer. The NPU also includes a controller. The controller is configured to direct the NDMA core to perform hardware memory bandwidth optimization for reading/writing NDMA data in the read buffer and/or NDMA data in the write buffer. The NDMA core is also configured to transparently combine NDMA transaction requests for a data stripe to increase local access to available tensors in artificial neural networks.