-
公开(公告)号:US20240241645A1
公开(公告)日:2024-07-18
申请号:US18621437
申请日:2024-03-29
Applicant: Intel Corporation
Inventor: Robert Pawlowski , Shruti Sharma , Fabio Checconi , Sriram Aananthakrishnan , Jesmin Jahan Tithi , Jordi Wolfson-Pou , Joshua B. Fryman
IPC: G06F3/06
CPC classification number: G06F3/0613 , G06F3/0656 , G06F3/0673
Abstract: Systems, apparatuses and methods may provide for technology that includes a plurality of hash management buffers corresponding to a plurality of pipelines, wherein each hash management buffer in the plurality of hash management buffers is adjacent to a pipeline in the plurality of pipelines, and wherein a first hash management buffer is to issue one or more hash packets associated with one or more hash operations on a hash table. The technology may also include a plurality of hash engines corresponding to a plurality of dynamic random access memories (DRAMs), wherein each hash engine in the plurality of hash engines is adjacent to a DRAM in the plurality of DRAMs, and wherein one or more of the hash engines is to initialize a target memory destination associated with the hash table and conduct the one or more hash operations in response to the one or more hash packets.
-
2.
公开(公告)号:US20240119015A1
公开(公告)日:2024-04-11
申请号:US18458462
申请日:2023-08-30
Applicant: Intel Corporation
Inventor: Shruti Sharma , Robert Pawlowski
CPC classification number: G06F13/1673 , G06F9/526
Abstract: Systems, apparatuses and methods may provide for technology that detects a condition in which a plurality of atomic instructions target a common address and different bit positions in a mask, generates a combined read-lock request for the plurality of atomic instructions in response to the condition, and sends the combined read-lock request to a lock buffer coupled to a memory device associated with the common address.
-
3.
公开(公告)号:US20240020253A1
公开(公告)日:2024-01-18
申请号:US18477787
申请日:2023-09-29
Applicant: Intel Corporation
Inventor: Shruti Sharma , Robert Pawlowski , Fabio Checconi , Jesmin Jahan Tithi
IPC: G06F13/28
CPC classification number: G06F13/28 , G06F2213/28
Abstract: Systems, apparatuses and methods may provide for technology that detects a plurality of sub-instruction requests from a first memory engine in a plurality of memory engines, wherein the plurality of sub-instruction requests are associated with a direct memory access (DMA) data type conversion request from a first pipeline, wherein each sub-instruction request corresponds to a data element in the DMA data type conversion request, and wherein the first memory engine is to correspond to the first pipeline, decodes the plurality of sub-instruction requests to identify one or more arguments, loads a source array from a dynamic random access memory (DRAM) in a plurality of DRAMs, wherein the operation engine is to correspond to the DRAM, and conducts a conversion of the source array from a first data type to a second data type in accordance with the one or more arguments.
-
公开(公告)号:US12158852B2
公开(公告)日:2024-12-03
申请号:US17358832
申请日:2021-06-25
Applicant: Intel Corporation
Inventor: Robert Pawlowski , Bharadwaj Krishnamurthy , Shruti Sharma , Byoungchan Oh , Jing Fang , Daniel Klowden , Jason Howard , Joshua Fryman
IPC: G06F13/28
Abstract: Systems, methods, and apparatuses for direct memory access instruction set architecture support for flexible dense compute using a reconfigurable spatial array are described. In one embodiment, a processor includes a first type of hardware processor core that includes a two-dimensional grid of compute circuits, a memory, and a direct memory access circuit coupled to the memory and the two-dimensional grid of compute circuits; and a second different type of hardware processor core that includes a decoder circuit to decode a single instruction into a decoded single instruction, the single instruction including a first field to identify a base address of two-dimensional data in the memory, a second field to identify a number of elements in each one-dimensional array of the two-dimensional data, a third field to identify a number of one-dimensional arrays of the two-dimensional data, a fourth field to identify an operation to be performed by the two-dimensional grid of compute circuits, and a fifth field to indicate the direct memory access circuit is to move the two-dimensional data indicated by the first field, the second field, and the third field into the two-dimensional grid of compute circuits and the two-dimensional grid of compute circuits is to perform the operation on the two-dimensional data according to the fourth field, and an execution circuit to execute the decoded single instruction according to the fields.
-
公开(公告)号:US12153932B2
公开(公告)日:2024-11-26
申请号:US17129555
申请日:2020-12-21
Applicant: Intel Corporation
Inventor: Ankit More , Fabrizio Petrini , Robert Pawlowski , Shruti Sharma , Sowmya Pitchaimoorthy
IPC: G06F9/4401 , G06F13/40
Abstract: Examples include techniques for an in-network acceleration of a parallel prefix-scan operation. Examples include configuring registers of a node included in a plurality of nodes on a same semiconductor package. The registers to be configured responsive to receiving an instruction that indicates a logical tree to map to a network topology that includes the node. The instruction associated with a prefix-scan operation to be executed by at least a portion of the plurality of nodes.
-
6.
公开(公告)号:US20230333998A1
公开(公告)日:2023-10-19
申请号:US18312752
申请日:2023-05-05
Applicant: Intel Corporation
Inventor: Shruti Sharma , Robert Pawlowski , Fabio Checconi , Jesmin Jahan Tithi
IPC: G06F13/28
CPC classification number: G06F13/28
Abstract: Systems, apparatuses and methods may provide for technology that includes a plurality of memory engines corresponding to a plurality of pipelines, wherein each memory engine in the plurality of memory engines is adjacent to a pipeline in the plurality of pipelines, and wherein a first memory engine is to request one or more direct memory access (DMA) operations associated with a first pipeline, and a plurality of operation engines corresponding to a plurality of dynamic random access memories (DRAMs), wherein each operation engine in the plurality of operation engines is adjacent to a DRAM in the plurality of DRAMs, and wherein one or more of the plurality of operation engines is to conduct the one or more DMA operations based on one or more bitmaps.
-
公开(公告)号:US20240256283A1
公开(公告)日:2024-08-01
申请号:US18566068
申请日:2022-03-31
Applicant: Intel Corporation
Inventor: Joshua B. Fryman , Byoungchan Oh , Sai Dheeraj Polagani , Kevin P. Ma , Robert S. Pawlowski , Bharadwaj Coimbatore Krishnamurthy , Shruti Sharma , Smitha P. Vasantha Kumar , Jason Howard , Daniel S. Klowden
CPC classification number: G06F9/3851 , G06F11/3409
Abstract: A system is provided that includes a set of graph processing cores and a set of dense compute cores. where the set of graph processing cores and the set of dense cores are interconnected in a network. The dense compute cores include offload queue circuitry to receive an offload request from the set of graph processing cores to handle dense compute workloads. Memory controllers are also provided in the system for use by the graph processing cores in reading and writing to memory in association with sparse graph applications. the memory controllers enhanced to efficiently handle memory transactions in sparse graph applications.
-
8.
公开(公告)号:US20230315451A1
公开(公告)日:2023-10-05
申请号:US18326623
申请日:2023-05-31
Applicant: Intel Corporation
Inventor: Shruti Sharma , Robert Pawlowski , Fabio Checconi , Jesmin Jahan Tithi
CPC classification number: G06F9/30043 , G06F9/30079 , G06F13/28
Abstract: Systems, apparatuses and methods may provide for technology that detects, by an operation engine, a plurality of sub-instruction requests from a first memory engine in a plurality of memory engines, wherein the plurality of sub-instruction requests are associated with a direct memory access (DMA) bitmap manipulation request from a first pipeline, wherein each sub-instruction request corresponds to a data element in the DMA bitmap manipulation request, and wherein the first memory engine is to correspond to the first pipeline. The technology also detects, by the operation engine, one or more arguments in the plurality of sub-instruction requests, sends, by the operation engine, one or more load requests to a DRAM in the plurality of DRAMs in accordance with the one or more arguments, and sends, by the operation engine, one or more store requests to the DRAM in accordance with the one or more arguments, wherein the operation engine is to correspond to the DRAM.
-
-
-
-
-
-
-