-
81.
公开(公告)号:US12008367B2
公开(公告)日:2024-06-11
申请号:US17845103
申请日:2022-06-21
Applicant: Intel Corporation
Inventor: Alexander F. Heinecke , Robert Valentine , Mark J. Charney , Raanan Sade , Menachem Adelman , Zeev Sperber , Amit Gradstein , Simon Rubanovich
CPC classification number: G06F9/30036 , G06F9/3001 , G06F9/30014 , G06F9/3016 , G06F9/3802
Abstract: Disclosed embodiments relate to systems and methods for performing 16-bit floating-point vector dot product instructions. In one example, a processor includes fetch circuitry to fetch an instruction having fields to specify an opcode and locations of first source, second source, and destination vectors, the opcode to indicate execution circuitry is to multiply N pairs of 16-bit floating-point formatted elements of the specified first and second sources, and accumulate the resulting products with previous contents of a corresponding single-precision element of the specified destination, decode circuitry to decode the fetched instruction, and execution circuitry to respond to the decoded instruction as specified by the opcode.
-
公开(公告)号:US11954489B2
公开(公告)日:2024-04-09
申请号:US17549363
申请日:2021-12-13
Applicant: Intel Corporation
Inventor: Bret Toll , Christopher J. Hughes , Dan Baum , Elmoustapha Ould-Ahmed-Vall , Raanan Sade , Robert Valentine , Mark J. Charney , Alexander F. Heinecke
IPC: G06F9/30
CPC classification number: G06F9/30145 , G06F9/30032 , G06F9/30036 , G06F9/30109
Abstract: Disclosed embodiments relate to systems for performing instructions to quickly convert and use matrices (tiles) as one-dimensional vectors. In one example, a processor includes fetch circuitry to fetch an instruction having fields to specify an opcode, locations of a two-dimensional (2D) matrix and a one-dimensional (1D) vector, and a group of elements comprising one of a row, part of a row, multiple rows, a column, part of a column, multiple columns, and a rectangular sub-tile of the specified 2D matrix, and wherein the opcode is to indicate a move of the specified group between the 2D matrix and the 1D vector, decode circuitry to decode the fetched instruction; and execution circuitry, responsive to the decoded instruction, when the opcode specifies a move from 1D, to move contents of the specified 1D vector to the specified group of elements.
-
公开(公告)号:US20230385059A1
公开(公告)日:2023-11-30
申请号:US18449651
申请日:2023-08-14
Applicant: Intel Corporation
Inventor: Raanan Sade , Simon Rubanovich , Amit Gradstein , Zeev Sperber , Alexander Heinecke , Robert Valentine , Mark J. Charney , Bret Toll , Jesus Corbal , Elmoustapha Ould-Ahmed-Vall
CPC classification number: G06F9/3001 , G06F9/30145 , G06F9/3005 , G06F9/30036 , G06F9/383 , G06F9/3016 , G06F9/30109 , G06F9/30123 , G06F9/30076 , G06F9/3824 , G06F9/30043
Abstract: Disclosed embodiments relate to computing dot products of nibbles in tile operands. In one example, a processor includes decode circuitry to decode a tile dot product instruction having fields for an opcode, a destination identifier to identify a M by N destination matrix, a first source identifier to identify a M by K first source matrix, and a second source identifier to identify a K by N second source matrix, each of the matrices containing doubleword elements, and execution circuitry to execute the decoded instruction to perform a flow K times for each element (M,N) of the identified destination matrix to generate eight products by multiplying each nibble of a doubleword element (M,K) of the identified first source matrix by a corresponding nibble of a doubleword element (K,N) of the identified second source matrix, and to accumulate and saturate the eight products with previous contents of the doubleword element (M,N).
-
84.
公开(公告)号:US11816036B2
公开(公告)日:2023-11-14
申请号:US17738919
申请日:2022-05-06
Applicant: Intel Corporation
Inventor: Anil Vasudevan , Venkata Krishnan , Andrew J. Herdrich , Ren Wang , Robert G. Blankenship , Vedaraman Geetha , Shrikant M. Shah , Marshall A. Millier , Raanan Sade , Binh Q. Pham , Olivier Serres , Chyi-Chang Miao , Christopher B. Wilkerson
IPC: G06F12/0868 , G06F12/0897 , G06F3/06 , G06F12/0811 , G06F12/0871
CPC classification number: G06F12/0868 , G06F3/065 , G06F3/068 , G06F3/0619 , G06F12/0811 , G06F12/0871 , G06F12/0897 , G06F2212/1024 , G06F2212/283 , G06F2212/311 , G06F2212/6046
Abstract: Method and system for performing data movement operations is described herein. One embodiment of a method includes: storing data for a first memory address in a cache line of a memory of a first processing unit, the cache line associated with a coherency state indicating that the memory has sole ownership of the cache line; decoding an instruction for execution by a second processing unit, the instruction comprising a source data operand specifying the first memory address and a destination operand specifying a memory location in the second processing unit; and responsive to executing the decoded instruction, copying data from the cache line of the memory of the first processing unit as identified by the first memory address, to the memory location of the second processing unit, wherein responsive to the copy, the cache line is to remain in the memory and the coherency state is to remain unchanged.
-
公开(公告)号:US20230273846A1
公开(公告)日:2023-08-31
申请号:US18313905
申请日:2023-05-08
Applicant: Intel Corporation
Inventor: Tomer Stark , Ron Gabor , Joseph Nuzman , Raanan Sade , Bryant E. Bigbee
IPC: G06F11/07 , G06F12/00 , G06F9/38 , G06F12/109 , G06F21/60
CPC classification number: G06F11/0751 , G06F11/073 , G06F12/00 , G06F9/38 , G06F12/109 , G06F21/60 , G06F12/145
Abstract: Methods and apparatuses relating to memory corruption detection are described. In one embodiment, a hardware processor includes an execution unit to execute an instruction to request access to a block of a memory through a pointer to the block of the memory, and a memory management unit to allow access to the block of the memory when a memory corruption detection value in the pointer is validated with a memory corruption detection value in the memory for the block, wherein a position of the memory corruption detection value in the pointer is selectable between a first location and a second, different location.
-
公开(公告)号:US11714648B2
公开(公告)日:2023-08-01
申请号:US17549221
申请日:2021-12-13
Applicant: Intel Corporation
Inventor: Bret Toll , Christopher J. Hughes , Dan Baum , Elmoustapha Ould-Ahmed-Vall , Raanan Sade , Robert Valentine , Mark J. Charney , Alexander F. Heinecke
IPC: G06F9/30
CPC classification number: G06F9/30145 , G06F9/30032 , G06F9/30036 , G06F9/30109
Abstract: Disclosed embodiments relate to systems for performing instructions to quickly convert and use matrices (tiles) as one-dimensional vectors. In one example, a processor includes fetch circuitry to fetch an instruction having fields to specify an opcode, locations of a two-dimensional (2D) matrix and a one-dimensional (1D) vector, and a group of elements comprising one of a row, part of a row, multiple rows, a column, part of a column, multiple columns, and a rectangular sub-tile of the specified 2D matrix, and wherein the opcode is to indicate a move of the specified group between the 2D matrix and the 1D vector, decode circuitry to decode the fetched instruction; and execution circuitry, responsive to the decoded instruction, when the opcode specifies a move from 1D, to move contents of the specified 1D vector to the specified group of elements.
-
87.
公开(公告)号:US11500630B2
公开(公告)日:2022-11-15
申请号:US16872865
申请日:2020-05-12
Applicant: Intel Corporation
Inventor: Robert Valentine , Mark Charney , Raanan Sade , Elmoustapha Ould-Ahmed-Vall , Jesus Corbal
Abstract: An embodiment of the invention is a processor including execution circuitry to, in response to a decoded instruction, convert a half-precision floating-point value to a single-precision floating-point value and store the single-precision floating-point value in each of the plurality of element locations of a destination register. The processor also includes a decoder and the destination register. The decoder is to decode an instruction to generate the decoded instruction.
-
88.
公开(公告)号:US20220261351A1
公开(公告)日:2022-08-18
申请号:US17738919
申请日:2022-05-06
Applicant: Intel Corporation
Inventor: Anil Vasudevan , Venkata Krishnan , Andrew J. Herdrich , Ren Wang , Robert G. Blankenship , Vedaraman Geetha , Shrikant M. Shah , Marshall A. Millier , Raanan Sade , Binh Q. Pham , Olivier Serres , Chyi-Chang Miao , Christopher B. Wilkerson
IPC: G06F12/0868 , G06F12/0897 , G06F3/06 , G06F12/0811 , G06F12/0871
Abstract: Method and system for performing data movement operations is described herein. One embodiment of a method includes: storing data for a first memory address in a cache line of a memory of a first processing unit, the cache line associated with a coherency state indicating that the memory has sole ownership of the cache line; decoding an instruction for execution by a second processing unit, the instruction comprising a source data operand specifying the first memory address and a destination operand specifying a memory location in the second processing unit; and responsive to executing the decoded instruction, copying data from the cache line of the memory of the first processing unit as identified by the first memory address, to the memory location of the second processing unit, wherein responsive to the copy, the cache line is to remain in the memory and the coherency state is to remain unchanged.
-
89.
公开(公告)号:US11372643B2
公开(公告)日:2022-06-28
申请号:US16186384
申请日:2018-11-09
Applicant: Intel Corporation
Inventor: Alexander F. Heinecke , Robert Valentine , Mark J. Charney , Raanan Sade , Menachem Adelman , Zeev Sperber , Amit Gradstein , Simon Rubanovich
Abstract: Disclosed embodiments relate to systems and methods for performing instructions to convert to 16-bit floating-point format. In one example, a processor includes fetch circuitry to fetch an instruction having fields to specify an opcode and locations of a first source vector comprising N single-precision elements, and a destination vector comprising at least N 16-bit floating-point elements, the opcode to indicate execution circuitry is to convert each of the elements of the specified source vector to 16-bit floating-point, the conversion to include truncation and rounding, as necessary, and to store each converted element into a corresponding location of the specified destination vector, decode circuitry to decode the fetched instruction, and execution circuitry to respond to the decoded instruction as specified by the opcode.
-
90.
公开(公告)号:US11366663B2
公开(公告)日:2022-06-21
申请号:US16186378
申请日:2018-11-09
Applicant: Intel Corporation
Inventor: Alexander F. Heinecke , Robert Valentine , Mark J. Charney , Raanan Sade , Menachem Adelman , Zeev Sperber , Amit Gradstein , Simon Rubanovich
Abstract: Disclosed embodiments relate to systems and methods for performing 16-bit floating-point vector dot product instructions. In one example, a processor includes fetch circuitry to fetch an instruction having fields to specify an opcode and locations of first source, second source, and destination vectors, the opcode to indicate execution circuitry is to multiply N pairs of 16-bit floating-point formatted elements of the specified first and second sources, and accumulate the resulting products with previous contents of a corresponding single-precision element of the specified destination, decode circuitry to decode the fetched instruction, and execution circuitry to respond to the decoded instruction as specified by the opcode.
-
-
-
-
-
-
-
-
-