-
公开(公告)号:US20210081201A1
公开(公告)日:2021-03-18
申请号:US17107823
申请日:2020-11-30
Applicant: Intel Corporation
Inventor: Subramaniam Maiyuran , Jorge Parra , Ashutosh Garg , Chandra Gurram , Chunhui Mei , Durgesh Borkar , Shubra Marwaha , Supratim Pal , Varghese George , Wei Xiong , Yan Li , Yongsheng Liu , Dipankar Das , Sasikanth Avancha , Dharma Teja Vooturi , Naveen K. Mellempudi
Abstract: An apparatus to facilitate utilizing structured sparsity in systolic arrays is disclosed. The apparatus includes a processor comprising a systolic array to receive data from a plurality of source registers, the data comprising unpacked source data, structured source data that is packed based on sparsity, and metadata corresponding to the structured source data; identify portions of the unpacked source data to multiply with the structured source data, the portions of the unpacked source data identified based on the metadata; and output, to a destination register, a result of multiplication of the portions of the unpacked source data and the structured source data.
-
公开(公告)号:US11977885B2
公开(公告)日:2024-05-07
申请号:US17107823
申请日:2020-11-30
Applicant: Intel Corporation
Inventor: Subramaniam Maiyuran , Jorge Parra , Ashutosh Garg , Chandra Gurram , Chunhui Mei , Durgesh Borkar , Shubra Marwaha , Supratim Pal , Varghese George , Wei Xiong , Yan Li , Yongsheng Liu , Dipankar Das , Sasikanth Avancha , Dharma Teja Vooturi , Naveen K. Mellempudi
CPC classification number: G06F9/30036 , G06F9/3001 , G06F9/30101 , G06F9/3893 , G06F15/8046
Abstract: An apparatus to facilitate utilizing structured sparsity in systolic arrays is disclosed. The apparatus includes a processor comprising a systolic array to receive data from a plurality of source registers, the data comprising unpacked source data, structured source data that is packed based on sparsity, and metadata corresponding to the structured source data; identify portions of the unpacked source data to multiply with the structured source data, the portions of the unpacked source data identified based on the metadata; and output, to a destination register, a result of multiplication of the portions of the unpacked source data and the structured source data.
-
公开(公告)号:US20240320000A1
公开(公告)日:2024-09-26
申请号:US18621539
申请日:2024-03-29
Applicant: Intel Corporation
Inventor: Subramaniam Maiyuran , Jorge Parra , Ashutosh Garg , Chandra Gurram , Chunhui Mei , Durgesh Borkar , Shubra Marwaha , Supratim Pal , Varghese George , Wei Xiong , Yan Li , Yongsheng Liu , Dipankar Das , Sasikanth Avancha , Dharma Teja Vooturi , Naveen K. Mellempudi
CPC classification number: G06F9/30036 , G06F9/3001 , G06F9/30101 , G06F9/3893 , G06F15/8046
Abstract: An apparatus to facilitate utilizing structured sparsity in systolic arrays is disclosed. The apparatus includes a processor comprising a systolic array to receive data from a plurality of source registers, the data comprising unpacked source data, structured source data that is packed based on sparsity, and metadata corresponding to the structured source data; identify portions of the unpacked source data to multiply with the structured source data, the portions of the unpacked source data identified based on the metadata; and output, to a destination register, a result of multiplication of the portions of the unpacked source data and the structured source data.
-
公开(公告)号:US20220413803A1
公开(公告)日:2022-12-29
申请号:US17304803
申请日:2021-06-25
Applicant: Intel Corporation
Inventor: Jorge Parra , Fangwen Fu , Subramaniam Maiyuran , Varghese George , Mike Macpherson , Supratim Pal , Chandra Gurram , Sabareesh Ganapathy , Sasikanth Avancha , Dharma Teja Vooturi , Naveen Mellempudi , Dipankar Das
Abstract: A processing apparatus is described herein that includes a general-purpose parallel processing engine comprising a matrix accelerator including one or more systolic arrays, at least one of the one or more systolic arrays comprising multiple pipeline stages, each pipeline stage of the multiple pipeline stages including multiple processing elements, the multiple processing elements configured to perform processing operations on input matrix elements based on output sparsity metadata. The output sparsity metadata indicates to the multiple processing elements to bypass multiplication for a first row of elements of a second matrix and multiply a second row of elements of the second matrix with a column of matrix elements of a first matrix.
-
-
-