Patent search ap:("Intel Corporation") AND inv:"Fabrizio Petrini" Page 1

1.

Granted invention patent
Apparatuses, methods, and systems for operations in a configurable spatial accelerator (Status: in force)

Publication No.: US11200186B2

Publication date: 2021-12-14

Application No.: US16024854

Application date: 2018-06-30

Applicant: Intel Corporation

Inventor: Kermin E. Fleming, Jr.; Simon C. Steely, Jr.; Kent D. Glossop; Mitchell Diamond; Benjamin Keen; Dennis Bradford; Fabrizio Petrini; Barry Tannenbaum; Yongzhi Zhang

IPC: G06F13/40, G06F9/30, G06F15/78

Abstract: Systems, methods, and apparatuses relating to operations in a configurable spatial accelerator are described. In one embodiment, a configurable spatial accelerator includes a first processing element that includes a configuration register within the first processing element to store a configuration value that causes the first processing element to perform an operation according to the configuration value, a plurality of input queues, an input controller to control enqueue and dequeue of values into the plurality of input queues according to the configuration value, a plurality of output queues, and an output controller to control enqueue and dequeue of values into the plurality of output queues according to the configuration value.
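As a rough illustration of the processing element this abstract describes, the following Python sketch models a configuration register that selects the operation, bounded input queues, and a bounded output queue. The class name, the two-input layout, the queue depth, the operation table, and the firing rule (fire only when every input has a value and the output has room) are all assumptions made for the example and are not taken from the patent.

```python
from collections import deque
import operator

# Hypothetical operation table; the patent does not list these names.
OPS = {"add": operator.add, "mul": operator.mul}


class ProcessingElement:
    """Toy model of one processing element (assumed structure)."""

    def __init__(self, config, queue_depth=2):
        self.config = config                       # stands in for the configuration register
        self.depth = queue_depth                   # assumed capacity of every queue
        self.inputs = [deque(), deque()]           # two input queues
        self.outputs = [deque()]                   # one output queue

    def enqueue_input(self, port, value):
        """Input controller: accept a value unless the queue is full."""
        if len(self.inputs[port]) >= self.depth:
            return False                           # backpressure to the producer
        self.inputs[port].append(value)
        return True

    def step(self):
        """Fire once if all operands are present and the output has room."""
        if any(not q for q in self.inputs):
            return False
        if len(self.outputs[0]) >= self.depth:
            return False
        a = self.inputs[0].popleft()
        b = self.inputs[1].popleft()
        self.outputs[0].append(OPS[self.config](a, b))
        return True
```

In this model, reconfiguring the element amounts to writing a different key into `config`; the queues and controllers stay the same.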

2.

Granted invention patent
Apparatuses, methods, and systems for operations in a configurable spatial accelerator (Status: in force)

Publication No.: US11593295B2

Publication date: 2023-02-28

Application No.: US17550875

Application date: 2021-12-14

Applicant: Intel Corporation

Inventor: Kermin E. Fleming, Jr.; Simon C. Steely, Jr.; Kent D. Glossop; Mitchell Diamond; Benjamin Keen; Dennis Bradford; Fabrizio Petrini; Barry Tannenbaum; Yongzhi Zhang

IPC: G06F13/40, G06F9/30, G06F15/78

Abstract: Systems, methods, and apparatuses relating to operations in a configurable spatial accelerator are described. In one embodiment, a configurable spatial accelerator includes a first processing element that includes a configuration register within the first processing element to store a configuration value that causes the first processing element to perform an operation according to the configuration value, a plurality of input queues, an input controller to control enqueue and dequeue of values into the plurality of input queues according to the configuration value, a plurality of output queues, and an output controller to control enqueue and dequeue of values into the plurality of output queues according to the configuration value.

3.

Invention patent application
MULTITHREADED PROCESSOR CORE WITH HARDWARE-ASSISTED TASK SCHEDULING (Status: under examination, published)

Publication No.: US20200004587A1

Publication date: 2020-01-02

Application No.: US16024343

Application date: 2018-06-29

Applicant: Intel Corporation

Inventor: Paul Griffin; Joshua Fryman; Jason Howard; Sang Phill Park; Robert Pawlowski; Michael Abbott; Scott Cline; Samkit Jain; Ankit More; Vincent Cave; Fabrizio Petrini; Ivan Ganev

IPC: G06F9/48, G06F9/38, G06F9/30

Abstract: Embodiments of apparatuses, methods, and systems for a multithreaded processor core with hardware-assisted task scheduling are described. In an embodiment, a processor includes a first hardware thread, a second hardware thread, and a task manager. The task manager is to issue a task to the first hardware thread. The task manager includes a hardware task queue in which to store a plurality of task descriptors. Each of the task descriptors is to represent one of a single task, a collection of iterative tasks, and a linked list of tasks.
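The three descriptor kinds named in this abstract (a single task, a collection of iterative tasks, a linked list of tasks) can be pictured with a small software model. This is only a sketch: the field names, the `expand` and `run` helpers, and the round-robin issue policy across two threads are hypothetical and say nothing about how the hardware task manager actually works.

```python
from collections import deque
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class TaskDescriptor:
    kind: str                                   # "single", "iterative", or "list"
    fn: Callable[[int], str]                    # task body; receives an iteration index
    count: int = 1                              # number of iterations ("iterative" only)
    next_task: Optional["TaskDescriptor"] = None  # successor node ("list" only)


def expand(desc):
    """Yield the (fn, arg) pairs that one descriptor represents."""
    if desc.kind == "single":
        yield desc.fn, 0
    elif desc.kind == "iterative":
        for i in range(desc.count):
            yield desc.fn, i
    elif desc.kind == "list":
        node = desc
        while node is not None:                 # walk the linked list of tasks
            yield node.fn, 0
            node = node.next_task
    else:
        raise ValueError(f"unknown descriptor kind: {desc.kind}")


def run(queue, num_threads=2):
    """Drain the task queue, issuing tasks round-robin (assumed policy)."""
    log = []
    thread = 0
    while queue:
        desc = queue.popleft()
        for fn, arg in expand(desc):
            log.append((thread, fn(arg)))       # record which thread ran what
            thread = (thread + 1) % num_threads
    return log
```

A queue holding one descriptor of each kind expands into individual tasks that alternate between the two modelled hardware threads.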

4.

Invention patent application
ARRAY BROADCAST AND REDUCTION SYSTEMS AND METHODS (Status: under examination, published)

Publication No.: US20200310795A1

Publication date: 2020-10-01

Application No.: US16369846

Application date: 2019-03-29

Applicant: INTEL CORPORATION

Inventor: Joshua Fryman; Ankit More; Jason Howard; Robert Pawlowski; Yigit Demir; Nick Pepperling; Fabrizio Petrini; Sriram Aananthakrishnan; Shaden Smith

IPC: G06F9/30, G06F9/32, G06F9/455

Abstract: The present disclosure is directed to systems and methods of performing one or more broadcast or reduction operations using direct memory access (DMA) control circuitry. The DMA control circuitry executes a modified instruction set architecture (ISA) that facilitates the broadcast distribution of data to a plurality of destination addresses in system memory circuitry. The broadcast instruction may include broadcast of a single data value to each destination address. The broadcast instruction may include broadcast of a data array to each destination address. The DMA control circuitry may also execute a reduction instruction that facilitates the retrieval of data from a plurality of source addresses in system memory and performing one or more operations using the retrieved data. Since the DMA control circuitry, rather than the processor circuitry, performs the broadcast and reduction operations, system speed and efficiency are beneficially enhanced.
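The three operations mentioned in this abstract (broadcast of a single value, broadcast of an array, and reduction over several source addresses) can be sketched in software as follows. The `DmaEngine` class, its method names, and the use of a flat Python list as "system memory" are assumptions for illustration only; the actual instruction encodings are not given in the abstract.

```python
from functools import reduce
import operator


class DmaEngine:
    """Toy model of DMA-side broadcast and reduction over a flat memory."""

    def __init__(self, memory):
        self.memory = memory                    # list standing in for system memory

    def broadcast_value(self, value, dests):
        """Write one value to every destination address."""
        for addr in dests:
            self.memory[addr] = value

    def broadcast_array(self, src, length, dests):
        """Copy memory[src:src+length] to each destination base address."""
        block = self.memory[src:src + length]   # snapshot of the source array
        for addr in dests:
            self.memory[addr:addr + length] = block

    def reduce(self, sources, op=operator.add):
        """Gather one value per source address and combine them with op."""
        return reduce(op, (self.memory[a] for a in sources))
```

The point of the model is that the caller only issues one request per collective operation; the loop over addresses happens inside the engine rather than in the requesting code.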

5.

Invention patent application
STRUCTURES AND OPERATIONS OF INTEGRATED CIRCUITS HAVING NETWORK OF CONFIGURABLE SWITCHES (Status: under examination, published)

Publication No.: US20190109590A1

Publication date: 2019-04-11

Application No.: US16201915

Application date: 2018-11-27

Applicant: Intel Corporation

Inventor: Ankit More; Jason M. Howard; Robert Pawlowski; Fabrizio Petrini; Shaden Smith

IPC: H03K17/00, H03K19/173, G11C7/10

CPC classification number: H03K17/005, G11C7/1006, H03K17/007, H03K19/1733

Abstract: Embodiments herein may present an integrated circuit including a switch, where the switch together with other switches forms a network of switches to perform a sequence of operations according to a structure of a collective tree. The switch includes a first number of input ports, a second number of output ports, a configurable crossbar to selectively couple the first number of input ports to the second number of output ports, and a computation engine coupled to the first number of input ports, the second number of output ports, and the crossbar. The computation engine of the switch performs an operation corresponding to an operation represented by a node of the collective tree. The switch further includes one or more registers to selectively configure the first number of input ports and the configurable crossbar. Other embodiments may be described and/or claimed.

6.

Invention patent publication
PROGRAM EXECUTION STRATEGIES FOR HETEROGENEOUS COMPUTING SYSTEMS (Status: under examination, published)

Publication No.: US20230367640A1

Publication date: 2023-11-16

Application No.: US18030057

Application date: 2021-04-23

Applicant: Intel Corporation

Inventor: Kermin E. ChoFleming, Jr.; Egor A. Kazachkov; Daya Shanker Khudia; Zakhar A. Matveev; Sergey U. Kokljuev; Fabrizio Petrini; Dmitry S. Petrov; Swapna Raj

IPC: G06F9/50, G06F11/30, G06F11/34

CPC classification number: G06F9/5044, G06F11/302, G06F11/3409, G06F2209/509, G06F2201/865

Abstract: An offload analyzer analyzes a program for porting to a heterogeneous computing system by identifying code objects for offloading to an accelerator. Runtime metrics generated by executing the program on a host processor unit are provided to an accelerator model that models the performance of the accelerator and generates estimated accelerator metrics for the program. A code object offload selector selects code objects for offloading based on whether estimated accelerated times of the code objects, which comprise estimated accelerator times and offload overhead times, are better than their host processor unit execution times. The code object offload selector selects additional code objects for offloading using a dynamic-programming-like performance estimation approach that performs a bottom-up traversal of a call tree. A heterogeneous version of the program can be generated for execution on the heterogeneous computing system.
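One way to picture the selection rule in this abstract is a post-order (bottom-up) pass over a call tree that, at each node, compares "offload this whole subtree" against "stay on the host and use the best plan found for each callee". The sketch below is a simplified guess at such a scheme, not the patented method: the `CodeObject` fields, the assumption that offloading a node offloads its entire subtree, and the cost model (accelerator time plus overhead versus host time) are all illustrative.

```python
from dataclasses import dataclass, field


@dataclass
class CodeObject:
    name: str
    host_self: float        # host time spent in this object, excluding callees
    accel_total: float      # estimated accelerator time for the whole subtree
    overhead: float         # estimated offload overhead (launch, data transfer)
    children: list = field(default_factory=list)


def plan(node):
    """Return (best_time, offloaded_names) for the subtree rooted at node.

    Children are resolved before their parent, so decisions propagate
    bottom-up through the call tree.
    """
    child_time = 0.0
    child_offloads = []
    for child in node.children:
        t, names = plan(child)
        child_time += t
        child_offloads += names

    stay_on_host = node.host_self + child_time
    offload_here = node.accel_total + node.overhead   # estimated accelerated time
    if offload_here < stay_on_host:
        return offload_here, [node.name]              # subsumes any child offloads
    return stay_on_host, child_offloads
```

With these assumptions, a hot leaf is offloaded on its own when the overhead pays off, while a parent is offloaded as a unit only if that beats the best combination found for its callees.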

7.

Invention patent publication
METHODS AND APPARATUS TO ACCELERATE MATRIX OPERATIONS USING DIRECT MEMORY ACCESS (Status: under examination, published)

Publication No.: US20230325185A1

Publication date: 2023-10-12

Application No.: US18194252

Application date: 2023-03-31

Applicant: Intel Corporation

Inventor: Jesmin Jahan Tithi; Fabio Checconi; Ahmed Helal; Fabrizio Petrini

IPC: G06F9/30, G06F12/08

CPC classification number: G06F9/3001, G06F12/08, G06F2213/28

Abstract: Systems, apparatus, articles of manufacture, and methods are disclosed for performance of sparse matrix times dense matrix operations. Example instructions cause programmable circuitry to control execution of the sparse matrix times dense matrix operation using a sparse matrix and a dense matrix stored in memory, and transmit a plurality of instructions to execute the sparse matrix times dense matrix operation to DMA engine circuitry, the plurality of instructions to cause DMA engine circuitry to create an output matrix in the memory, the creation of the output matrix in the memory performed without the programmable circuitry computing the output matrix.
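For readers unfamiliar with the operation itself, the following sketch shows the arithmetic of a sparse matrix times dense matrix product with the sparse operand in compressed sparse row (CSR) form. It runs entirely on the CPU and does not model DMA engines; the CSR layout and the function name `spmm_csr` are assumptions chosen for the example, since the abstract does not specify a storage format.

```python
def spmm_csr(indptr, indices, data, dense):
    """Multiply a CSR sparse matrix by a dense matrix (list of rows).

    indptr[i]:indptr[i+1] delimits the nonzeros of sparse row i;
    indices[k] is the column of nonzero k and data[k] its value.
    """
    n_rows = len(indptr) - 1
    n_cols = len(dense[0])
    out = [[0.0] * n_cols for _ in range(n_rows)]
    for i in range(n_rows):
        for k in range(indptr[i], indptr[i + 1]):
            j, a = indices[k], data[k]
            row = dense[j]                      # gather the dense row selected by column j
            for c in range(n_cols):
                out[i][c] += a * row[c]         # scale and accumulate into output row i
    return out
```

Each nonzero triggers a "gather a dense row, scale it, accumulate it" step, which is the kind of memory-bound data movement the abstract proposes to hand to DMA engine circuitry instead of the programmable circuitry.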

8.

Granted invention patent
Array broadcast and reduction systems and methods (Status: in force)

Publication No.: US10983793B2

Publication date: 2021-04-20

Application No.: US16369846

Application date: 2019-03-29

Applicant: INTEL CORPORATION

Inventor: Joshua Fryman; Ankit More; Jason Howard; Robert Pawlowski; Yigit Demir; Nick Pepperling; Fabrizio Petrini; Sriram Aananthakrishnan; Shaden Smith

IPC: G06F9/30, G06F13/28, G06F9/32, G06F9/455

Abstract: The present disclosure is directed to systems and methods of performing one or more broadcast or reduction operations using direct memory access (DMA) control circuitry. The DMA control circuitry executes a modified instruction set architecture (ISA) that facilitates the broadcast distribution of data to a plurality of destination addresses in system memory circuitry. The broadcast instruction may include broadcast of a single data value to each destination address. The broadcast instruction may include broadcast of a data array to each destination address. The DMA control circuitry may also execute a reduction instruction that facilitates the retrieval of data from a plurality of source addresses in system memory and performing one or more operations using the retrieved data. Since the DMA control circuitry, rather than the processor circuitry, performs the broadcast and reduction operations, system speed and efficiency are beneficially enhanced.

9.

Granted invention patent
Techniques for acceleration of a prefix-scan operation (Status: in force)

Publication No.: US12153932B2

Publication date: 2024-11-26

Application No.: US17129555

Application date: 2020-12-21

Applicant: Intel Corporation

Inventor: Ankit More; Fabrizio Petrini; Robert Pawlowski; Shruti Sharma; Sowmya Pitchaimoorthy

IPC: G06F9/4401, G06F13/40

Abstract: Examples include techniques for an in-network acceleration of a parallel prefix-scan operation. Examples include configuring registers of a node included in a plurality of nodes on a same semiconductor package. The registers to be configured responsive to receiving an instruction that indicates a logical tree to map to a network topology that includes the node. The instruction associated with a prefix-scan operation to be executed by at least a portion of the plurality of nodes.
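To make the idea of mapping a prefix scan onto a logical tree concrete, here is a sketch of the well-known up-sweep/down-sweep (Blelloch-style) exclusive scan, written sequentially. It is a generic textbook algorithm shown for orientation, not the in-network scheme the patent claims; the function name, the power-of-two length restriction, and the in-place array standing in for tree nodes are assumptions.

```python
def tree_exclusive_scan(values, op=lambda a, b: a + b, identity=0):
    """Exclusive prefix scan using an implicit binary tree over the input."""
    n = len(values)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a nonzero power of two")
    t = list(values)

    # Up-sweep: each internal node accumulates the total of its subtree.
    step = 1
    while step < n:
        for i in range(2 * step - 1, n, 2 * step):
            t[i] = op(t[i - step], t[i])
        step *= 2

    # Down-sweep: push prefixes from the root back down to the leaves.
    t[n - 1] = identity
    step = n // 2
    while step >= 1:
        for i in range(2 * step - 1, n, 2 * step):
            left_total = t[i - step]
            t[i - step] = t[i]                  # left child inherits the parent's prefix
            t[i] = op(t[i], left_total)         # right child adds the left subtree total
        step //= 2
    return t
```

All updates within one level of either sweep are independent, which is what makes a tree-shaped mapping attractive when the levels correspond to nodes in a network rather than loop iterations.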

10.

Granted invention patent
Storage architectures for graph analysis applications (Status: in force)

Publication No.: US11526483B2

Publication date: 2022-12-13

Application No.: US15941262

Application date: 2018-03-30

Applicant: Intel Corporation

Inventor: Stijn Eyerman; Jason M. Howard; Ibrahim Hur; Ivan B. Ganev; Fabrizio Petrini; Joshua B. Fryman

IPC: G06F16/00, G06F16/22, G06F16/901

Abstract: Methods, apparatus, systems and articles of manufacture to build a storage architecture for graph data are disclosed herein. Disclosed example apparatus include a neighbor identifier to identify respective sets of neighboring vertices of a graph. The neighboring vertices included in the respective sets are adjacent to respective ones of a plurality of vertices of the graph and respective sets of neighboring vertices are represented as respective lists of neighboring vertex identifiers. The apparatus also includes an element creator to create, in a cache memory, an array of elements that are unpopulated. The array elements have lengths equal to a length of a cache line. In addition, the apparatus includes an element populater to populate the elements with neighboring vertex identifiers. Each of the elements stores neighboring vertex identifiers of respective ones of the list of neighboring vertex identifiers.
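The three roles named in this abstract (identify neighbor lists, create unpopulated cache-line-sized elements, populate them with neighbor identifiers) can be sketched as below. The 64-byte cache line, 4-byte vertex identifiers, the `-1` sentinel for empty slots, and the rule that a long neighbor list spills into consecutive extra elements are all assumptions for the example; the abstract does not state any of them.

```python
CACHE_LINE_BYTES = 64                       # assumed cache line size
ID_BYTES = 4                                # assumed size of one vertex identifier
SLOTS = CACHE_LINE_BYTES // ID_BYTES        # identifiers that fit in one element
EMPTY = -1                                  # assumed marker for an unpopulated slot


def neighbor_lists(num_vertices, edges):
    """Neighbor identifier: build one neighbor list per vertex (undirected)."""
    adj = [[] for _ in range(num_vertices)]
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    return adj


def build_elements(adj):
    """Create cache-line-sized elements and populate them with neighbor ids.

    Returns (elements, first), where first[v] is the index of the first
    element that belongs to vertex v.
    """
    elements = []
    first = []
    for nbrs in adj:
        first.append(len(elements))
        chunks = max(1, -(-len(nbrs) // SLOTS))     # ceiling division, at least one
        for c in range(chunks):
            elem = [EMPTY] * SLOTS                  # element creator: unpopulated
            part = nbrs[c * SLOTS:(c + 1) * SLOTS]
            elem[:len(part)] = part                 # element populater
            elements.append(elem)
    return elements, first
```

Because every element is exactly one cache line long, reading a vertex's neighbors touches whole lines and never straddles a line boundary in this model.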
