Publication No.: US11379229B2
Publication Date: 2022-07-05
Application No.: US16987838
Application Date: 2020-08-07
Applicant: INTEL CORPORATION
Inventor: Jonathan Pearce, David Sheffield, Srikanth Srinivasan, Jeffrey Cook, Debbie Marr, Abhijit Davare, Asit Mishra, Steven Burns, Desmond A. Kirkpatrick, Andrey Ayupov, Anton Alexandrovich Sorokin, Eriko Nurvitadhi
IPC: G06F9/30, G06F9/38, G06F17/16, G06F7/57, G06F12/0831, G06F12/084
Abstract: An apparatus and method for performing efficient, adaptable tensor operations. For example, one embodiment of a processor comprises: front end circuitry to schedule matrix operations responsive to a matrix multiplication instruction; a plurality of lanes to perform parallel execution of the matrix operations, wherein a lane comprises an arithmetic logic unit to multiply a block of a first matrix with a block of a second matrix to generate a product and to accumulate the product with a block of a third matrix, and wherein the matrix blocks are to be stored in registers within the lane; and broadcast circuitry to broadcast one or more invariant matrix blocks to at least one of different registers within the lane and different registers across different lanes.
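
The per-lane operation in the abstract (multiply a block of a first matrix by a block of a second matrix, then accumulate the product with a block of a third matrix) can be illustrated with a small software model. This is only a sketch under stated assumptions, not the claimed hardware: the function names `block_matmul_acc` and `run_lanes`, the dictionary standing in for lane registers, the 2x2 block size, and the choice of the second-matrix block as the invariant, broadcast block are all hypothetical and do not come from the patent.

```python
# Illustrative software model only. All names, the 2x2 block size, and the
# choice of B as the invariant block are assumptions, not taken from the patent.

def block_matmul_acc(a, b, c):
    """Return c + a @ b for small dense blocks given as lists of rows."""
    rows, inner, cols = len(a), len(b), len(b[0])
    return [
        [c[i][j] + sum(a[i][k] * b[k][j] for k in range(inner))
         for j in range(cols)]
        for i in range(rows)
    ]


def run_lanes(a_blocks, b_block, c_blocks):
    """Model each lane as one block multiply-accumulate.

    b_block plays the role of the invariant block: the same block is
    "broadcast" into the registers of every lane, while each lane keeps
    its own A block and its own C (accumulator) block.
    """
    lane_registers = [
        {"a": a, "b": b_block, "c": c}  # hypothetical per-lane register file
        for a, c in zip(a_blocks, c_blocks)
    ]
    # Real lanes would run in parallel; here they are evaluated one by one.
    return [block_matmul_acc(r["a"], r["b"], r["c"]) for r in lane_registers]


# Two lanes, each with its own A and C block, sharing one broadcast B block.
a_blocks = [[[1, 2], [3, 4]], [[0, 1], [1, 0]]]
b_block = [[1, 0], [0, 1]]  # identity, so each result is simply C + A
c_blocks = [[[10, 10], [10, 10]], [[0, 0], [0, 0]]]

result = run_lanes(a_blocks, b_block, c_blocks)
print(result)
```

In this model the lanes are evaluated sequentially; the point is only to show which operands are private to a lane and which one is shared by broadcast.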