-
公开(公告)号:US12135981B2
公开(公告)日:2024-11-05
申请号:US18207870
申请日:2023-06-09
Applicant: Intel Corporation
Inventor: Rajesh M. Sankaran , Gilbert Neiger , Narayan Ranganathan , Stephen R. Van Doren , Joseph Nuzman , Niall D. McDonnell , Michael A. O'Hanlon , Lokpraveen B. Mosur , Tracy Garrett Drysdale , Eriko Nurvitadhi , Asit K. Mishra , Ganesh Venkatesh , Deborah T. Marr , Nicholas P. Carter , Jonathan D. Pearce , Edward T. Grochowski , Richard J. Greco , Robert Valentine , Jesus Corbal , Thomas D. Fletcher , Dennis R. Bradford , Dwight P. Manley , Mark J. Charney , Jeffrey J. Cook , Paul Caprioli , Koichi Yamada , Kent D. Glossop , David B. Sheffield
Abstract: Embodiments of systems, methods, and apparatuses for heterogeneous computing are described. In some embodiments, a hardware heterogeneous scheduler dispatches instructions for execution on one or more plurality of heterogeneous processing elements, the instructions corresponding to a code fragment to be processed by the one or more of the plurality of heterogeneous processing elements, wherein the instructions are native instructions to at least one of the one or more of the plurality of heterogeneous processing elements.
-
公开(公告)号:US11416281B2
公开(公告)日:2022-08-16
申请号:US16474978
申请日:2016-12-31
Applicant: Intel Corporation
Inventor: Rajesh M. Sankaran , Gilbert Neiger , Narayan Ranganathan , Stephen R. Van Doren , Joseph Nuzman , Niall D. McDonnell , Michael A. O'Hanlon , Lokpraveen B. Mosur , Tracy Garrett Drysdale , Eriko Nurvitadhi , Asit K. Mishra , Ganesh Venkatesh , Deborah T. Marr , Nicholas P. Carter , Jonathan D. Pearce , Edward T. Grochowski , Richard J. Greco , Robert Valentine , Jesus Corbal , Thomas D. Fletcher , Dennis R. Bradford , Dwight P. Manley , Mark J. Charney , Jeffrey J. Cook , Paul Caprioli , Koichi Yamada , Kent D. Glossop , David B. Sheffield
Abstract: Embodiments of systems, methods, and apparatuses for heterogeneous computing are described. In some embodiments, a hardware heterogeneous scheduler dispatches instructions for execution on one or more plurality of heterogeneous processing elements, the instructions corresponding to a code fragment to be processed by the one or more of the plurality of heterogeneous processing elements, wherein the instructions are native instructions to at least one of the one or more of the plurality of heterogeneous processing elements.
-
3.
公开(公告)号:US20200310817A1
公开(公告)日:2020-10-01
申请号:US16364704
申请日:2019-03-26
Applicant: Intel Corporation
Inventor: Jeffrey J. Cook , Srikanth T. Srinivasan , Jonathan D. Pearce , David B. Sheffield
Abstract: In one embodiment, an apparatus includes: a plurality of execution lanes to perform parallel execution of instructions; and a unified symbolic store address buffer coupled to the plurality of execution lanes, the unified symbolic store address buffer comprising a plurality of entries each to store a symbolic store address for a store instruction to be executed by at least some of the plurality of execution lanes. Other embodiments are described and claimed.
-
4.
公开(公告)号:US11188341B2
公开(公告)日:2021-11-30
申请号:US16364704
申请日:2019-03-26
Applicant: Intel Corporation
Inventor: Jeffrey J. Cook , Srikanth T. Srinivasan , Jonathan D. Pearce , David B. Sheffield
Abstract: In one embodiment, an apparatus includes: a plurality of execution lanes to perform parallel execution of instructions; and a unified symbolic store address buffer coupled to the plurality of execution lanes, the unified symbolic store address buffer comprising a plurality of entries each to store a symbolic store address for a store instruction to be executed by at least some of the plurality of execution lanes. Other embodiments are described and claimed.
-
公开(公告)号:US20170090924A1
公开(公告)日:2017-03-30
申请号:US14866921
申请日:2015-09-26
Applicant: Intel Corporation
Inventor: Asit K. Mishra , Edward T. Grochowski , Jonathan D. Pearce , Deborah T. Marr , Ehud Cohen , Elmoustapha OuId-Ahmed-Vall , Jesus Corbal San Adrian , Robert Valentine , Mark J. Charney , Christopher J. Hughes , Milind B. Girkar
IPC: G06F9/30
Abstract: A processor includes a decode unit to decode an instruction that is to indicate a first source packed data operand that is to include at least four data elements, to indicate a second source packed data operand that is to include at least four data elements, and to indicate one or more destination storage locations. The execution unit, in response to the instruction, is to store at least one result mask operand in the destination storage location(s). The at least one result mask operand is to include a different mask element for each corresponding data element in one of the first and second source packed data operands in a same relative position. Each mask element is to indicate whether the corresponding data element in said one of the source packed data operands equals any of the data elements in the other of the source packed data operands.
-
公开(公告)号:US11886875B2
公开(公告)日:2024-01-30
申请号:US16232599
申请日:2018-12-26
Applicant: Intel Corporation
Inventor: Elmoustapha Ould-Ahmed-Vall , Jonathan D. Pearce , Dan Baum , Guei-Yuan Lueh , Michael Espig , Christopher J. Hughes , Raanan Sade , Robert Valentine , Mark J. Charney , Alexander F. Heinecke
IPC: G06F9/30
CPC classification number: G06F9/30036 , G06F9/3001 , G06F9/30018 , G06F9/30038
Abstract: Disclosed embodiments relate to systems and methods for performing nibble-sized operations on matrix elements. In one example, a processor includes fetch circuitry to fetch an instruction, decode circuitry to decode the fetched instruction the fetched instruction having fields to specify an opcode and locations of first source, second source, and destination matrices, the opcode to indicate the processor is to, for each pair of corresponding elements of the first and second source matrices, logically partition each element into nibble-sized partitions, perform an operation indicated by the instruction on each partition, and store execution results to a corresponding nibble-sized partition of a corresponding element of the destination matrix. The exemplary processor includes execution circuitry to execute the decoded instruction as per the opcode.
-
公开(公告)号:US11416411B2
公开(公告)日:2022-08-16
申请号:US16354859
申请日:2019-03-15
Applicant: Intel Corporation
Inventor: Murali Ramadoss , Vikranth Vemulapalli , Niran Cooray , William B. Sadler , Jonathan D. Pearce , Marian Alin Petre , Ben Ashbaugh , Elmoustapha Ould-Ahmed-Vall , Nicolas Galoppo Von Borries , Altug Koker , Aravindh Anantaraman , Subramaniam Maiyuran , Varghese George , Sungye Kim , Valentin Andrei
IPC: G06F12/1009 , G06N20/00 , G06T1/20
Abstract: Methods and apparatus relating to predictive page fault handling. In an example, an apparatus comprises a processor to receive a virtual address that triggered a page fault for a compute process, check a virtual memory space for a virtual memory allocation for the compute process that triggered the page fault and manage the page fault according to one of a first protocol in response to a determination that the virtual address that triggered the page fault is a last page in the virtual memory allocation for the compute process, or a second protocol in response to a determination that the virtual address that triggered the page fault is not a last page in the virtual memory allocation for the compute process. Other embodiments are also disclosed and claimed.
-
公开(公告)号:US11113053B2
公开(公告)日:2021-09-07
申请号:US16579394
申请日:2019-09-23
Applicant: Intel Corporation
Inventor: Asit K. Mishra , Edward T. Grochowski , Jonathan D. Pearce , Deborah T. Marr , Ehud Cohen , Elmoustapha Ould-Ahmed-Vall , Jesus Corbal San Adrian , Robert Valentine , Mark J. Charney , Christopher J. Hughes , Milind B. Girkar
IPC: G06F9/30
Abstract: A processor includes a decode unit to decode an instruction that is to indicate a first source packed data operand that is to include at least four data elements, to indicate a second source packed data operand that is to include at least four data elements, and to indicate one or more destination storage locations. The execution unit, in response to the instruction, is to store at least one result mask operand in the destination storage location(s). The at least one result mask operand is to include a different mask element for each corresponding data element in one of the first and second source packed data operands in a same relative position. Each mask element is to indicate whether the corresponding data element in said one of the source packed data operands equals any of the data elements in the other of the source packed data operands.
-
公开(公告)号:US20200310815A1
公开(公告)日:2020-10-01
申请号:US16364688
申请日:2019-03-26
Applicant: Intel Corporation
Inventor: Andrey Ayupov , Srikanth T. Srinivasan , Jonathan D. Pearce , David B. Sheffield
Abstract: In one embodiment, an apparatus includes: a plurality of registers; a first instruction queue to store first instructions; a second instruction queue to store second instructions; a program order queue having a plurality of portions each associated with one of the plurality of registers, each of the portions having entries to store a state of an instruction, the state comprising an encoding of a use of the register by the instruction and a source instruction queue for the instruction; and a dispatcher to dispatch for execution the first and second instructions from the first and second instruction queues based at least in part on information stored in the program order queue, to manage instruction dependencies between the first instructions and the second instructions. Other embodiments are described and claimed.
-
公开(公告)号:US11243775B2
公开(公告)日:2022-02-08
申请号:US16364688
申请日:2019-03-26
Applicant: Intel Corporation
Inventor: Andrey Ayupov , Srikanth T. Srinivasan , Jonathan D. Pearce , David B. Sheffield
IPC: G06F9/38
Abstract: In one embodiment, an apparatus includes: a plurality of registers; a first instruction queue to store first instructions; a second instruction queue to store second instructions; a program order queue having a plurality of portions each associated with one of the plurality of registers, each of the portions having entries to store a state of an instruction, the state comprising an encoding of a use of the register by the instruction and a source instruction queue for the instruction; and a dispatcher to dispatch for execution the first and second instructions from the first and second instruction queues based at least in part on information stored in the program order queue, to manage instruction dependencies between the first instructions and the second instructions. Other embodiments are described and claimed.
-
-
-
-
-
-
-
-
-