-
Publication number: US11443091B1
Publication date: 2022-09-13
Application number: US16945006
Filing date: 2020-07-31
Applicant: Xilinx, Inc.
Inventor: Peter McColgan , Baris Ozgul , David Clarke , Tim Tuan , Juan J. Noguera Serra , Goran H. K. Bilski , Jan Langer , Sneha Bhalchandra Date , Stephan Munz , Jose Marques
IPC: G06F30/343 , G06F9/30 , G06F30/398 , G06F30/33
Abstract: An integrated circuit includes a plurality of data processing engines (DPEs). Each DPE may include a core configured to perform computations. A first DPE of the plurality of DPEs includes a first core coupled to an input cascade connection of the first core. The input cascade connection is directly coupled to a plurality of source cores of the plurality of DPEs. The input cascade connection includes a plurality of inputs, wherein each of the plurality of inputs is connected to a cascade output of a different one of the plurality of source cores. The input cascade connection is programmable to enable a selected one of the plurality of inputs.
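Illustrative sketch (not taken from the patent): the programmable input cascade connection described in this abstract behaves roughly like a multiplexer with one input per source core. The class names `SourceCore` and `InputCascadeConnection`, the `program` and `read` methods, and the integer payloads are all hypothetical, chosen only to make the selection behavior concrete.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class SourceCore:
    """Hypothetical stand-in for a core that drives a cascade output."""
    name: str
    cascade_out: int = 0  # value currently present on the cascade output


class InputCascadeConnection:
    """Toy model: one input per source core, at most one input enabled."""

    def __init__(self, sources: List[SourceCore]) -> None:
        self.sources = sources
        self.selected: Optional[int] = None  # models the programmable selection

    def program(self, index: int) -> None:
        """Enable exactly one of the inputs."""
        if not 0 <= index < len(self.sources):
            raise ValueError(f"no cascade input {index}")
        self.selected = index

    def read(self) -> Optional[int]:
        """Return the value from the enabled input, or None if unprogrammed."""
        if self.selected is None:
            return None
        return self.sources[self.selected].cascade_out


west = SourceCore("west", cascade_out=11)
north = SourceCore("north", cascade_out=22)
connection = InputCascadeConnection([west, north])
connection.program(1)  # select the input wired to `north`
print(connection.read())
```

Reprogramming the selection with `connection.program(0)` switches the first core to the other source without changing any wiring, which is the property the abstract emphasizes.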
-
Publication number: US11113223B1
Publication date: 2021-09-07
Application number: US15944490
Filing date: 2018-04-03
Applicant: Xilinx, Inc.
Inventor: Peter McColgan , Goran H. K. Bilski , Juan J. Noguera Serra , Jan Langer , Baris Ozgul , David Clarke
Abstract: Examples herein describe techniques for communicating between data processing engines in an array of data processing engines. In one embodiment, the array is a 2D array where each of the DPEs includes one or more cores. In addition to the cores, the data processing engines can include streaming interconnects which transmit streaming data using two different modes: circuit switching and packet switching. Circuit switching establishes reserved point-to-point communication paths between endpoints in the interconnect which routes data in a deterministic manner. Packet switching, in contrast, transmits streaming data that includes headers for routing data within the interconnect in a non-deterministic manner. In one embodiment, the streaming interconnects can have one or more ports configured to perform circuit switching and one or more ports configured to perform packet switching.
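Illustrative sketch (not taken from the patent): the two streaming modes in this abstract can be contrasted with a small routing model. In the circuit-switched case the output port is fixed by a reserved route on the input port; in the packet-switched case it is looked up from a header carried with the data. The `StreamSwitch` class, its method names, the port names, and the one-word header format are assumptions made for illustration, and the model ignores timing, so it does not capture the deterministic versus non-deterministic latency distinction.

```python
from typing import Dict, List, Tuple


class StreamSwitch:
    """Toy stream switch with circuit-switched and packet-switched routing."""

    def __init__(self) -> None:
        self.circuit_routes: Dict[str, str] = {}  # input port -> reserved output port
        self.packet_routes: Dict[int, str] = {}   # header ID -> output port

    def reserve_circuit(self, in_port: str, out_port: str) -> None:
        """Reserve a fixed point-to-point path through this switch."""
        self.circuit_routes[in_port] = out_port

    def add_packet_route(self, packet_id: int, out_port: str) -> None:
        """Associate a header ID with an output port."""
        self.packet_routes[packet_id] = out_port

    def forward_circuit(self, in_port: str, word: int) -> Tuple[str, int]:
        """Circuit switching: the route depends only on the input port."""
        return self.circuit_routes[in_port], word

    def forward_packet(self, packet: List[int]) -> Tuple[str, List[int]]:
        """Packet switching: the first word is a header that selects the route."""
        header, payload = packet[0], packet[1:]
        return self.packet_routes[header], payload


switch = StreamSwitch()
switch.reserve_circuit("south0", "north0")
switch.add_packet_route(7, "east0")
print(switch.forward_circuit("south0", 0xAB))
print(switch.forward_packet([7, 1, 2, 3]))
```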
-
Publication number: US20230053537A1
Publication date: 2023-02-23
Application number: US17819879
Filing date: 2022-08-15
Applicant: Xilinx, Inc.
Inventor: Baris Ozgul , David Clarke , Peter McColgan , Stephan Munz , Dylan Stuart , Pedro Miguel Parola Duarte , Juan J. Noguera Serra
Abstract: Using multiple overlays with a data processing array includes loading an application in a data processing array. The data processing array includes a plurality of compute tiles each having a processor. The application specifies kernels executable by the processors and implements stream channels that convey data to the plurality of compute tiles. During runtime of the application, a plurality of overlays are sequentially implemented in the data processing array. Each overlay implements a different mode of data movement in the data processing array via the stream channels. For each overlay implemented, a workload is performed by moving data to the plurality of compute tiles based on the respective mode of data movement.
-
Publication number: US11296707B1
Publication date: 2022-04-05
Application number: US17196574
Filing date: 2021-03-09
Applicant: Xilinx, Inc.
Inventor: Javier Cabezas Rodriguez , Juan J. Noguera Serra , David Clarke , Sneha Bhalchandra Date , Tim Tuan , Peter McColgan , Jan Langer , Baris Ozgul
IPC: H03K19/1776 , H03K19/17704 , H03K19/17768 , H03K19/17758 , H03K19/17796
Abstract: An integrated circuit can include a data processing engine (DPE) array having a plurality of tiles. The plurality of tiles can include a plurality of DPE tiles, wherein each DPE tile includes a stream switch, a core configured to perform operations, and a memory module. The plurality of tiles can include a plurality of memory tiles, wherein each memory tile includes a stream switch, a direct memory access (DMA) engine, and a random-access memory. The DMA engine of each memory tile may be configured to access the random-access memory within the same memory tile and the random-access memory of at least one other memory tile. Selected ones of the plurality of DPE tiles may be configured to access selected ones of the plurality of memory tiles via the stream switches.
-
Publication number: US12067406B2
Publication date: 2024-08-20
Application number: US17819879
Filing date: 2022-08-15
Applicant: Xilinx, Inc.
Inventor: Baris Ozgul , David Clarke , Peter McColgan , Stephan Münz , Dylan Stuart , Pedro Miguel Parola Duarte , Juan J. Noguera Serra
CPC classification number: G06F9/44505 , G06F9/5083 , G06F13/1673 , G06F13/28 , G06F17/16 , G06N3/063
Abstract: Using multiple overlays with a data processing array includes loading an application in a data processing array. The data processing array includes a plurality of compute tiles each having a processor. The application specifies kernels executable by the processors and implements stream channels that convey data to the plurality of compute tiles. During runtime of the application, a plurality of overlays are sequentially implemented in the data processing array. Each overlay implements a different mode of data movement in the data processing array via the stream channels. For each overlay implemented, a workload is performed by moving data to the plurality of compute tiles based on the respective mode of data movement.
-
Publication number: US20230376437A1
Publication date: 2023-11-23
Application number: US17663824
Filing date: 2022-05-17
Applicant: Xilinx, Inc.
Inventor: David Patrick Clarke , Peter McColgan , Juan J. Noguera Serra , Tim Tuan , Saurabh Mathur , Amarnath Kasibhatla , Javier Cabezas Rodriguez , Pedro Miguel Parola Duarte , Zachary Blaise Dickman
IPC: G06F13/28
CPC classification number: G06F13/28 , G06F2213/28
Abstract: An integrated circuit (IC) can include a data processing array including a plurality of compute tiles arranged in a grid. The IC can include an array interface coupled to the data processing array. The array interface includes a plurality of interface tiles. Each interface tile includes a plurality of direct memory access circuits. The IC can include a network-on-chip (NoC) coupled to the array interface. Each direct memory access circuit is communicatively linked to the NoC via an independent communication channel.
-
Publication number: US11730325B2
Publication date: 2023-08-22
Application number: US17468346
Filing date: 2021-09-07
Applicant: XILINX, INC.
Inventor: Peter McColgan , Goran H. K. Bilski , Juan J. Noguera Serra , Jan Langer , Baris Ozgul , David Clarke
CPC classification number: A47K11/02 , E04H1/1216 , E04H15/38 , G06F13/4022 , Y02A50/30
Abstract: Examples herein describe techniques for communicating between data processing engines in an array of data processing engines. In one embodiment, the array is a 2D array where each of the DPEs includes one or more cores. In addition to the cores, the data processing engines can include streaming interconnects which transmit streaming data using two different modes: circuit switching and packet switching. Circuit switching establishes reserved point-to-point communication paths between endpoints in the interconnect which routes data in a deterministic manner. Packet switching, in contrast, transmits streaming data that includes headers for routing data within the interconnect in a non-deterministic manner. In one embodiment, the streaming interconnects can have one or more ports configured to perform circuit switching and one or more ports configured to perform packet switching.
-
Publication number: US11520717B1
Publication date: 2022-12-06
Application number: US17196669
Filing date: 2021-03-09
Applicant: Xilinx, Inc.
Inventor: David Clarke , Peter McColgan , Zachary Dickman , Jose Marques , Juan J. Noguera Serra , Tim Tuan , Baris Ozgul , Jan Langer
Abstract: An integrated circuit having a data processing engine (DPE) array can include a plurality of memory tiles. A first memory tile can include a first direct memory access (DMA) engine, a first random-access memory (RAM) connected to the first DMA engine, and a first stream switch coupled to the first DMA engine. The first DMA engine may be coupled to a second RAM disposed in a second memory tile. The first stream switch may be coupled to a second stream switch disposed in the second memory tile.
-
Publication number: US20240088900A1
Publication date: 2024-03-14
Application number: US18509128
Filing date: 2023-11-14
Applicant: Xilinx, Inc.
Inventor: Juan J. Noguera Serra , Tim Tuan , Javier Cabezas Rodriguez , David Clarke , Peter McColgan , Zachary Blaise Dickman , Saurabh Mathur , Amarnath Kasibhatla , Francisco Barat Quesada
IPC: H03K19/1776 , G11C5/02 , H03K19/17764 , H03K19/17784
CPC classification number: H03K19/1776 , G11C5/025 , H03K19/17764 , H03K19/17784
Abstract: An apparatus includes a data processing array having a plurality of array tiles. The plurality of array tiles include a plurality of compute tiles. The compute tiles include a core coupled to a random-access memory (RAM) in a same compute tile and to a RAM of at least one other compute tile. The data processing array is subdivided into a plurality of partitions. Each partition includes a plurality of array tiles including at least one of the plurality of compute tiles. The apparatus includes a plurality of clock gate circuits being programmable to selectively gate a clock signal provided to a respective one of the plurality of partitions.
-
Publication number: US20230336179A1
Publication date: 2023-10-19
Application number: US17659423
Filing date: 2022-04-15
Applicant: Xilinx, Inc.
Inventor: Juan J. Noguera Serra , Tim Tuan , Javier Cabezas Rodriguez , David Clarke , Peter McColgan , Zachary Blaise Dickman , Saurabh Mathur , Amarnath Kasibhatla , Francisco Barat Quesada
IPC: H03K19/1776 , H03K19/17784 , H03K19/17764 , G11C5/02
CPC classification number: H03K19/1776 , H03K19/17784 , H03K19/17764 , G11C5/025
Abstract: An apparatus includes a data processing array having a plurality of array tiles. Each array tile can include a random-access memory (RAM) having a local memory interface accessible by circuitry within the array tile and an adjacent memory interface accessible by circuitry disposed within an adjacent array tile. Each adjacent memory interface of each array tile can include isolation logic that is programmable to allow the circuitry disposed within the adjacent array tile to access the RAM or prevent the circuitry disposed within the adjacent array tile from accessing the RAM. The data processing array can be subdivided into a plurality of partitions wherein the isolation logic of the adjacent memory interfaces is programmed to prevent array tiles from accessing RAMs across a boundary between the plurality of partitions.
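Illustrative sketch (not taken from the patent): one way to picture the isolation logic in this abstract is as a per-neighbor permission on each tile's RAM, programmed so that access is allowed only when the neighbor lies in the same partition. The function name `program_isolation`, the `(row, column)` tile coordinates, and the four-neighbor adjacency rule are assumptions made for illustration.

```python
from typing import Dict, Tuple

Coord = Tuple[int, int]  # hypothetical (row, column) position of an array tile


def program_isolation(partition_of: Dict[Coord, int]) -> Dict[Tuple[Coord, Coord], bool]:
    """Return {(ram_owner, neighbor): allowed} for every adjacent tile pair.

    Access through an adjacent memory interface is allowed only when the
    neighbor belongs to the same partition as the tile that owns the RAM.
    """
    allowed: Dict[Tuple[Coord, Coord], bool] = {}
    for (row, col), partition in partition_of.items():
        for d_row, d_col in ((0, 1), (0, -1), (1, 0), (-1, 0)):
            neighbor = (row + d_row, col + d_col)
            if neighbor in partition_of:
                allowed[((row, col), neighbor)] = partition_of[neighbor] == partition
    return allowed


# Three tiles in one row: the first two form partition 0, the third partition 1.
partitions = {(0, 0): 0, (0, 1): 0, (0, 2): 1}
isolation = program_isolation(partitions)
print(isolation[((0, 1), (0, 0))])  # neighbor in the same partition
print(isolation[((0, 1), (0, 2))])  # neighbor across the partition boundary
```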