-
1.
Publication No.: US10649772B2
Publication Date: 2020-05-12
Application No.: US15941526
Filing Date: 2018-03-30
Applicant: Intel Corporation
Inventor: Dennis Ryan Bradford, Jesus Corbal, Brian Hickmann, Rohan Sharma
Abstract: Disclosed embodiments relate to a method and apparatus for efficient matrix transpose. In one example, a processor to execute a matrix transpose instruction includes fetch circuitry to fetch the matrix transpose instruction specifying a destination matrix and a source matrix having (N×M) elements and (M×N) elements, respectively, a (N×M) load buffer, decode circuitry to decode the fetched matrix transpose instruction, and execution circuitry, responsive to the decoded matrix transpose instruction to, for each row X of M rows of the specified source matrix: fetch and buffer N elements of the row in a load register, and cause the N buffered elements to be written, in the same relative order as in the row, to column X of M columns of the load buffer, and the execution circuitry subsequently to write each of N rows of the load buffer to a same row of the destination matrix.
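The data movement described in this abstract can be modeled in software. The following is a minimal sketch, not the patented circuitry: it assumes the source is an M-row by N-column list of lists and that the final step copies each load-buffer row into the matching row of the destination matrix. The function name `transpose_via_load_buffer` and the variable names are hypothetical.

```python
def transpose_via_load_buffer(source):
    """Software model (assumption) of the row-to-column buffering scheme.

    source: M rows, each holding N elements.
    Returns a destination with N rows, each holding M elements.
    """
    m = len(source)
    n = len(source[0]) if m else 0

    # N x M load buffer, initially empty.
    load_buffer = [[None] * m for _ in range(n)]

    for x, row in enumerate(source):
        # Fetch and buffer the N elements of source row X.
        load_register = list(row)
        # Write them, in the same relative order, down column X of the buffer.
        for i, element in enumerate(load_register):
            load_buffer[i][x] = element

    # Write each of the N buffer rows to the same row of the destination.
    destination = [list(buffer_row) for buffer_row in load_buffer]
    return destination


if __name__ == "__main__":
    print(transpose_via_load_buffer([[1, 2, 3], [4, 5, 6]]))
```

Here a 2×3 source (M = 2, N = 3) yields a 3×2 destination, since each source row lands in one column of the buffer.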
-
2.
Publication No.: US09804842B2
Publication Date: 2017-10-31
Application No.: US14581535
Filing Date: 2014-12-23
Applicant: Intel Corporation
Inventor: Jesus Corbal San Adrian, Dennis R. Bradford, Benjamin C. Chaffin, Taraneh Bahrami, Jonathan C. Hall, Thomas B. Maciukenas, Roger Gramunt, Rohan Sharma
CPC classification number: G06F9/30036, G06F9/30018, G06F9/30032, G06F9/30072, G06F9/30101, G06F15/8084
Abstract: An apparatus and method for efficiently managing the architectural state of a processor. For example, one embodiment of a processor comprises: a source mask register to be logically subdivided into at least a first portion to store a usable portion of a mask value and a second portion to store an indication of whether the usable portion of the mask value has been updated; a control register to store an unusable portion of the mask value; architectural state management logic to read the indication to determine whether the mask value has been updated prior to performing a store operation, wherein if the mask value has been updated, then the architectural state management logic is to read the usable portion of the mask value from the first portion of the source mask register and zero out bits of the unusable portion of the mask value to generate a final mask value to be saved to memory, and wherein if the mask value has not been updated, then the architectural state management logic is to concatenate the usable portion of the mask value with the unusable portion of the mask value read from the control register to generate a final mask value to be saved to memory.
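The two save paths in this abstract can be illustrated with plain bit arithmetic. This is a sketch under stated assumptions only: the abstract does not give field widths or bit positions, so the 16-bit usable field in the low bits, the 64-bit total width, and the name `final_mask_for_store` are all hypothetical.

```python
USABLE_BITS = 16   # hypothetical width of the usable portion
TOTAL_BITS = 64    # hypothetical width of the full mask value


def final_mask_for_store(usable, updated, unusable_from_control):
    """Build the mask value to be saved to memory.

    usable: usable portion read from the source mask register.
    updated: the indication stored alongside the usable portion.
    unusable_from_control: unusable portion held in the control register.
    Assumes the usable portion occupies the low bits of the final value.
    """
    usable_field = usable & ((1 << USABLE_BITS) - 1)
    if updated:
        # Mask was updated: keep the usable bits, zero the unusable bits.
        return usable_field
    # Mask not updated: concatenate the usable bits with the unusable bits
    # preserved in the control register.
    upper_width = TOTAL_BITS - USABLE_BITS
    upper_field = unusable_from_control & ((1 << upper_width) - 1)
    return (upper_field << USABLE_BITS) | usable_field


if __name__ == "__main__":
    print(hex(final_mask_for_store(0xABCD, True, 0x1234)))
    print(hex(final_mask_for_store(0xABCD, False, 0x1234)))
```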
-
3.
Publication No.: US20200310797A1
Publication Date: 2020-10-01
Application No.: US16370915
Filing Date: 2019-03-30
Applicant: Intel Corporation
Inventor: Jesus Corbal, Rohan Sharma, Simon Steely, Jr., Chinmay Ashok, Kent D. Glossop, Dennis Bradford, Paul Caprioli, Louise Huot, Kermin ChoFleming, Barry Tannenbaum
Abstract: Systems, methods, and apparatuses relating to swizzle operations and disable operations in a configurable spatial accelerator (CSA) are described. Certain embodiments herein provide for an encoding system for a specific set of swizzle primitives across a plurality of packed data elements in a CSA. In one embodiment, a CSA includes a plurality of processing elements, a circuit switched interconnect network between the plurality of processing elements, and a configuration register within each processing element to store a configuration value having a first portion that, when set to a first value that indicates a first mode, causes the processing element to pass an input value to operation circuitry of the processing element without modifying the input value, and, when set to a second value that indicates a second mode, causes the processing element to perform a swizzle operation on the input value to form a swizzled input value before sending the swizzled input value to the operation circuitry of the processing element, and a second portion that causes the processing element to perform an operation indicated by the second portion of the configuration value on the input value in the first mode and the swizzled input value in the second mode with the operation circuitry.
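The two-part configuration value can be pictured as a mode selector plus an operation selector. The sketch below is a toy software model under loud assumptions: the abstract does not name any particular swizzle primitive or operation, so `swap_halves`, `subtract_pair`, the mode constants, and `run_processing_element` are hypothetical stand-ins.

```python
MODE_PASS_THROUGH = 0  # first mode: input reaches the operation unmodified
MODE_SWIZZLE = 1       # second mode: input is swizzled first


def swap_halves(packed):
    """Hypothetical swizzle primitive: exchange the two halves of a tuple."""
    half = len(packed) // 2
    return packed[half:] + packed[:half]


def subtract_pair(packed):
    """Hypothetical operation: first packed element minus the second."""
    return packed[0] - packed[1]


def run_processing_element(config, packed_input):
    """config is (mode, operation): the first and second portions."""
    mode, operation = config
    if mode == MODE_SWIZZLE:
        value = swap_halves(packed_input)
    else:
        value = packed_input
    # The same operation runs on either the raw or the swizzled input.
    return operation(value)


if __name__ == "__main__":
    print(run_processing_element((MODE_PASS_THROUGH, subtract_pair), (5, 3)))
    print(run_processing_element((MODE_SWIZZLE, subtract_pair), (5, 3)))
```

With the same operation configured, switching only the mode portion changes which ordering of the packed elements the operation sees.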
-
4.
Publication No.: US09606847B2
Publication Date: 2017-03-28
Application No.: US14575614
Filing Date: 2014-12-18
Applicant: Intel Corporation
Inventor: Jesus San Adrian Corbal, Dennis R. Bradford, Rohan Sharma
CPC classification number: G06F11/07, G06F11/0706, G06F11/0751
Abstract: In accordance with embodiments disclosed herein, there are provided systems and methods for detecting and reporting errors in a machine check environment. A processing device includes an error monitoring module, which detects an error corresponding to data associated with execution of an instruction by the processing device and determines whether the error occurs on a portion of the data that affects a result of the instruction. The processing device further enables error detection when it is determined that the error occurs on the portion of the data that affects the result of the execution of the instruction.
-
5.
Publication No.: US10817291B2
Publication Date: 2020-10-27
Application No.: US16370915
Filing Date: 2019-03-30
Applicant: Intel Corporation
Inventor: Jesus Corbal, Rohan Sharma, Simon Steely, Jr., Chinmay Ashok, Kent D. Glossop, Dennis Bradford, Paul Caprioli, Louise Huot, Kermin ChoFleming, Barry Tannenbaum
Abstract: Systems, methods, and apparatuses relating to swizzle operations and disable operations in a configurable spatial accelerator (CSA) are described. Certain embodiments herein provide for an encoding system for a specific set of swizzle primitives across a plurality of packed data elements in a CSA. In one embodiment, a CSA includes a plurality of processing elements, a circuit switched interconnect network between the plurality of processing elements, and a configuration register within each processing element to store a configuration value having a first portion that, when set to a first value that indicates a first mode, causes the processing element to pass an input value to operation circuitry of the processing element without modifying the input value, and, when set to a second value that indicates a second mode, causes the processing element to perform a swizzle operation on the input value to form a swizzled input value before sending the swizzled input value to the operation circuitry of the processing element, and a second portion that causes the processing element to perform an operation indicated by the second portion of the configuration value on the input value in the first mode and the swizzled input value in the second mode with the operation circuitry.
-