Patent search ap:("SambaNova Systems Page Inc.") AND inv:"David Alan Koeplinger"

1.

Granted invention patent
Matrix normal/transpose read and a reconfigurable data processor including same (Active)

Publication (announcement) number: US10768899B2

Publication (announcement) date: 2020-09-08

Application number: US16260548

Application date: 2019-01-29

Applicant: SambaNova Systems, Inc.

Inventor: David Alan Koeplinger, Raghu Prabhakar, Ram Sivaramakrishnan, David Brian Jackson, Mark Luttrell

IPC: G06F7/78, G06F5/08, G06F7/76, G06F12/02

Abstract: A configurable circuit configurable according to the data width of elements of a matrix is described that includes a memory array, logic to write a matrix to the memory array having elements with a data width which can be specified using configuration data, logic for a transpose read of the matrix as-written and logic for normal read of the matrix as-written. The memory array includes first and second read ports operable in parallel. Transpose read logic and normal read logic can be coupled to the first and second read ports, respectively, allowing transpose and normal read of a matrix simultaneously.
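As a rough software illustration of the abstract above (not the patented circuit), the sketch below models a memory array that is written once and then read in both normal and transpose order. The names `MatrixBuffer`, `read_normal`, and `read_transpose` are hypothetical, and modeling the configurable data width as a simple bit mask is an assumption made for illustration.

```python
class MatrixBuffer:
    """Toy model of a memory array with one write path and two read paths."""

    def __init__(self, rows, cols, data_width_bits):
        self.rows = rows
        self.cols = cols
        # Assumption: the configured element width is modeled as a bit mask.
        self.mask = (1 << data_width_bits) - 1
        self.cells = [[0] * cols for _ in range(rows)]

    def write(self, matrix):
        # Store the matrix as-written, truncating each element to the width.
        for r in range(self.rows):
            for c in range(self.cols):
                self.cells[r][c] = matrix[r][c] & self.mask

    def read_normal(self, row):
        # Normal read: return one row of the matrix as-written.
        return list(self.cells[row])

    def read_transpose(self, col):
        # Transpose read: return one column, i.e. one row of the transpose.
        return [self.cells[r][col] for r in range(self.rows)]


buf = MatrixBuffer(rows=2, cols=3, data_width_bits=8)
buf.write([[1, 2, 3], [4, 5, 6]])
normal_row0 = buf.read_normal(0)        # first row of the matrix
transposed_row0 = buf.read_transpose(0)  # first row of the transpose
```

In hardware the two reads would use separate ports in parallel; here they are simply two methods that do not modify the stored data.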

2.

Granted invention patent
Compile time logic for inserting a buffer between a producer operation unit and a consumer operation unit in a dataflow graph (Active)

Publication (announcement) number: US12105630B2

Publication (announcement) date: 2024-10-01

Application number: US17582421

Application date: 2022-01-24

Applicant: SambaNova Systems, Inc.

Inventor: Kevin James Brown, David Alan Koeplinger, Weiwei Chen, Xiaoming Gu

IPC: G06F12/0842, G06F8/41, G06F15/78, G06F5/10, G06F9/448, G06F11/30

CPC classification number: G06F12/0842, G06F8/447, G06F8/457, G06F15/7892, G06F5/10, G06F11/3072, G06F2205/123, G06F2212/1016, G06F2212/45

Abstract: A dataflow graph for an application has operation units that are configured to be producers and consumers of tensors. A write access pattern of a particular producer specifies an order in which the particular producer generates elements of a tensor, and a read access pattern of a corresponding consumer specifies an order in which the corresponding consumer processes the elements of the tensor. The technology disclosed detects conflicts between the producers and the corresponding consumers that have mismatches between the write access patterns and the read access patterns. A conflict occurs when the order in which the particular producer generates the elements of the tensor is different from the order in which the corresponding consumer processes the elements of the tensor. The technology disclosed resolves the conflicts by inserting buffers between the producers and the corresponding consumers.
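The conflict rule in the abstract above can be sketched in a few lines. This is a minimal illustration, assuming access patterns are represented as explicit lists of element indices; the `Edge` class and the `has_conflict` and `insert_buffers` helpers are hypothetical names, not taken from the patent.

```python
from dataclasses import dataclass


@dataclass
class Edge:
    """A producer-to-consumer connection carrying one tensor."""
    producer: str
    consumer: str
    write_order: list  # element indices in the order the producer generates them
    read_order: list   # element indices in the order the consumer processes them
    buffered: bool = False


def has_conflict(edge):
    # A conflict occurs when the generation order differs from the processing order.
    return edge.write_order != edge.read_order


def insert_buffers(edges):
    # Resolve each conflict by marking the edge as needing a buffer.
    for edge in edges:
        if has_conflict(edge):
            edge.buffered = True
    return [edge for edge in edges if edge.buffered]


# A 2x2 tensor produced row by row.
row_major = [(0, 0), (0, 1), (1, 0), (1, 1)]
# The same tensor consumed column by column.
col_major = [(0, 0), (1, 0), (0, 1), (1, 1)]

edges = [
    Edge("matmul", "relu", row_major, row_major),       # orders match
    Edge("matmul", "transpose", row_major, col_major),  # orders differ
]
buffered_edges = insert_buffers(edges)
```

Only the second edge ends up buffered, because its consumer reads elements in a different order than the producer writes them.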

3.

Granted invention patent
Compile time logic for detecting streaming compatible and broadcast compatible data access patterns (Active)

Publication (announcement) number: US11237971B1

Publication (announcement) date: 2022-02-01

Application number: US17023015

Application date: 2020-09-16

Applicant: SambaNova Systems, Inc.

Inventor: Kevin James Brown, David Alan Koeplinger, Weiwei Chen, Xiaoming Gu

IPC: G06F12/00, G06F13/00, G06F13/28, G06F12/0842, G06F5/10, G06F11/30

Abstract: A dataflow graph for an application has operation units that are configured to be producers and consumers of tensors. A write access pattern of a particular producer specifies an order in which the particular producer generates elements of a tensor, and a read access pattern of a corresponding consumer specifies an order in which the corresponding consumer processes the elements of the tensor. The technology disclosed detects conflicts between the producers and the corresponding consumers that have mismatches between the write access patterns and the read access patterns. A conflict occurs when the order in which the particular producer generates the elements of the tensor is different from the order in which the corresponding consumer processes the elements of the tensor. The technology disclosed resolves the conflicts by inserting buffers between the producers and the corresponding consumers.

4.

Granted invention patent
Compiler flow logic for reconfigurable architectures (Active)

Publication (announcement) number: US11714780B2

Publication (announcement) date: 2023-08-01

Application number: US17326128

Application date: 2021-05-20

Applicant: SambaNova Systems, Inc.

Inventor: David Alan Koeplinger, Raghu Prabhakar, Sumti Jairath

IPC: G06F15/78, G06F16/901, G06F8/41, G06F12/02

CPC classification number: G06F15/7871, G06F12/023, G06F16/9024

Abstract: The technology disclosed partitions a dataflow graph of a high-level program into memory allocations and execution fragments. The memory allocations represent creation of logical memory spaces in on-processor and/or off-processor memories for data required to implement the dataflow graph. The execution fragments represent operations on the data. The technology disclosed designates the memory allocations to virtual memory units and the execution fragments to virtual compute units. The technology disclosed partitions the execution fragments into memory fragments and compute fragments, and assigns the memory fragments to the virtual memory units and the compute fragments to the virtual compute units. The technology disclosed then allocates the virtual memory units to physical memory units and the virtual compute units to physical compute units. It then places the physical memory units and the physical compute units onto positions in the array of configurable units and routes data and control networks between the placed positions.

5.

Granted invention patent
Compiler flow logic for reconfigurable architectures (Active)

Publication (announcement) number: US11080227B2

Publication (announcement) date: 2021-08-03

Application number: US16536192

Application date: 2019-08-08

Applicant: SambaNova Systems, Inc.

Inventor: David Alan Koeplinger, Raghu Prabhakar, Sumti Jairath

IPC: G06F15/78, G06F16/901, G06F12/02

Abstract: The technology disclosed partitions a dataflow graph of a high-level program into memory allocations and execution fragments. The memory allocations represent creation of logical memory spaces in on-processor and/or off-processor memories for data required to implement the dataflow graph. The execution fragments represent operations on the data. The technology disclosed designates the memory allocations to virtual memory units and the execution fragments to virtual compute units. The technology disclosed partitions the execution fragments into memory fragments and compute fragments, and assigns the memory fragments to the virtual memory units and the compute fragments to the virtual compute units. The technology disclosed then allocates the virtual memory units to physical memory units and the virtual compute units to physical compute units. It then places the physical memory units and the physical compute units onto positions in the array of configurable units and routes data and control networks between the placed positions.

6.

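Granted invention patent
Buffer splitting (Active)

Publication (announcement) number: US12164463B2

Publication (announcement) date: 2024-12-10

Application number: US18130667

Application date: 2023-04-04

Applicant: SambaNova Systems, Inc.

Inventor: David Alan Koeplinger, Weihang Fan

IPC: G06F15/80, G06F15/82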

Abstract: A method in a reconfigurable computing system includes receiving a user program for execution on a reconfigurable dataflow computing system, comprising a grid of compute units and grid of memory units interconnected with a switching array. The user program includes multiple tensor-based algebraic expressions that are converted to an intermediate representation comprising one or more logical operations executable via dataflow through compute units. These one or more logical operations are preceded by or followed by a buffer, each buffer corresponding to one or more memory units. The method includes determining whether splitting a selected buffer yields a reduced cost and then splitting the selected buffer, in response to the determining step, to produce first and second buffers. Dataflow through memory units corresponding to the first and second buffers is controlled by one or more memory units within the grid of memory units. Buffer splitting optimization reduces memory unit consumption.

7.

Granted invention patent
Anti-congestion flow control for reconfigurable processors (Active)

Publication (announcement) number: US11709664B2

Publication (announcement) date: 2023-07-25

Application number: US16890841

Application date: 2020-06-02

Applicant: SambaNova Systems, Inc.

Inventor: Weiwei Chen, Raghu Prabhakar, David Alan Koeplinger, Sitanshu Gupta, Ruddhi Arun Chaphekar, Ajit Punj, Sumti Jairath

IPC: G06F8/41, G06F15/78, G06F15/82

CPC classification number: G06F8/452, G06F8/41, G06F15/7867, G06F15/825

Abstract: A compiler configured to configure memory nodes with a ready-to-read credit counter and a write credit counter. The ready-to-read credit counter of a particular upstream memory node initialized with as many read credits as a buffer depth of a corresponding downstream memory node. The ready-to-read credit counter configured to decrement when a buffer data unit is written by the particular upstream memory node into the corresponding downstream memory node, and to increment when the particular upstream memory node receives from the corresponding downstream memory node a read ready token. The write credit counter of the particular upstream memory node initialized with one or more write credits and configured to decrement when the particular upstream memory node begins writing the buffer data unit into the corresponding downstream memory node, and to increment when the particular upstream memory node receives from the corresponding downstream memory node a write done token.
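The two counters described in the abstract above can be modeled in software roughly as follows. This is a simplified sketch, not the patented design: the `UpstreamNode` class and its method names are hypothetical, both counters are decremented at the start of a write for brevity, and the rule that a write may begin only when both counters are positive is an assumption.

```python
class UpstreamNode:
    """Toy model of an upstream memory node with two credit counters."""

    def __init__(self, downstream_buffer_depth, write_credits=1):
        # Ready-to-read credits start at the downstream node's buffer depth.
        self.read_credits = downstream_buffer_depth
        # Write credits start at one or more.
        self.write_credits = write_credits

    def can_write(self):
        # Assumption: a write needs at least one credit of each kind.
        return self.read_credits > 0 and self.write_credits > 0

    def begin_write(self):
        """Try to start writing one buffer data unit downstream."""
        if not self.can_write():
            return False
        self.write_credits -= 1  # decremented when the write begins
        self.read_credits -= 1   # simplification: also charged here
        return True

    def on_write_done(self):
        # Write done token received from the downstream node.
        self.write_credits += 1

    def on_read_ready(self):
        # Read ready token received: a downstream buffer slot was freed.
        self.read_credits += 1


node = UpstreamNode(downstream_buffer_depth=2, write_credits=1)
log = []
log.append(node.begin_write())  # first unit goes out
log.append(node.begin_write())  # blocked: previous write still in flight
node.on_write_done()
log.append(node.begin_write())  # second unit fills the downstream buffer
node.on_write_done()
log.append(node.begin_write())  # blocked: downstream buffer is full
node.on_read_ready()
log.append(node.begin_write())  # a slot was freed, so writing resumes
```

The write credits limit how many writes are in flight at once, while the ready-to-read credits keep the upstream node from overrunning the downstream buffer.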

8.

Granted invention patent
Runtime patching of configuration files (Active)

Publication (announcement) number: US11782729B2

Publication (announcement) date: 2023-10-10

Application number: US16996666

Application date: 2020-08-18

Applicant: SambaNova Systems, Inc.

Inventor: Gregory Frederick Grohoski, Manish K. Shah, Raghu Prabhakar, Mark Luttrell, Ravinder Kumar, Kin Hing Leung, Ranen Chatterjee, Sumti Jairath, David Alan Koeplinger, Ram Sivaramakrishnan, Matthew Thomas Grimm

IPC: G06F9/44, G06F9/445, G06F9/50

CPC classification number: G06F9/44505, G06F9/5016

Abstract: A data processing system comprises a pool of reconfigurable data flow resources and a runtime processor. The pool of reconfigurable data flow resources includes arrays of physical configurable units and memory. The runtime processor includes logic to receive a plurality of configuration files for user applications. The configuration files include configurations of virtual data flow resources required to execute the user applications. The runtime processor also includes logic to allocate physical configurable units and memory in the pool of reconfigurable data flow resources to the virtual data flow resources and load the configuration files to the allocated physical configurable units. The runtime processor further includes logic to execute the user applications using the allocated physical configurable units and memory.

9.

Granted invention patent
Systems and methods for memory layout determination and conflict resolution (Active)

Publication (announcement) number: US11645057B2

Publication (announcement) date: 2023-05-09

Application number: US17031679

Application date: 2020-09-24

Applicant: SambaNova Systems, Inc.

Inventor: David Alan Koeplinger, Weiwei Chen, Kevin James Brown, Xiaoming Gu

IPC: G06F8/41, G06F12/0842

CPC classification number: G06F8/443, G06F8/433, G06F12/0842

Abstract: A dataflow graph has operation units that are configured to be producer operation units to produce tensors for execution of the application, and to be consumer operation units to consume the tensors for execution of the application. Compile time logic is configured to process the dataflow graph to determine, for the tensors, expected producer memory layouts, expected consumer memory layouts, and current memory layouts. The expected producer memory layouts specify memory layouts required by the producer operation units that produce the tensors. The expected consumer memory layouts specify the memory layouts required by the consumer operation units that consume the tensors. The current memory layouts specify the memory layouts of the tensors. Each of the memory layouts includes a vector dimension and at least one of a vector ordering and a data alignment.
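One way to picture the three kinds of layouts in the abstract above is the sketch below. The abstract does not say how conflicts are detected, so treating any mismatch among the producer, current, and consumer layouts as a conflict is an assumption; the `MemoryLayout` class and `find_layout_conflicts` function are hypothetical names.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class MemoryLayout:
    """A vector dimension plus at least one of ordering and alignment."""
    vector_dim: int
    vector_ordering: Optional[str] = None
    data_alignment: Optional[int] = None

    def __post_init__(self):
        # The abstract requires at least one of the two optional attributes.
        if self.vector_ordering is None and self.data_alignment is None:
            raise ValueError("need a vector ordering or a data alignment")


def find_layout_conflicts(tensor_layouts):
    """Return names of tensors whose three layouts do not all agree.

    Assumption: a conflict is any mismatch among the expected producer
    layout, the current layout, and the expected consumer layout.
    """
    conflicts = []
    for name, layouts in tensor_layouts.items():
        if not (layouts["producer"] == layouts["current"] == layouts["consumer"]):
            conflicts.append(name)
    return conflicts


row_major = MemoryLayout(vector_dim=1, vector_ordering="row-major", data_alignment=64)
col_major = MemoryLayout(vector_dim=0, vector_ordering="column-major", data_alignment=64)

tensors = {
    "activations": {"producer": row_major, "current": row_major, "consumer": row_major},
    "weights": {"producer": row_major, "current": row_major, "consumer": col_major},
}
conflicts = find_layout_conflicts(tensors)
```

Here only `weights` is flagged, since its consumer expects a different vector dimension and ordering than the layout the tensor currently has.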

10.

Invention application
MATRIX NORMAL/TRANSPOSE READ AND A RECONFIGURABLE DATA PROCESSOR INCLUDING SAME (Under examination, published)

Publication (announcement) number: US20200241844A1

Publication (announcement) date: 2020-07-30

Application number: US16260548

Application date: 2019-01-29

Applicant: SambaNova Systems, Inc.

Inventor: David Alan Koeplinger, Raghu Prabhakar, Ram Sivaramakrishnan, David Brian Jackson, Mark Luttrell

IPC: G06F7/78, G06F12/02, G06F7/76, G06F5/08

Abstract: A configurable circuit configurable according to the data width of elements of a matrix is described that includes a memory array, logic to write a matrix to the memory array having elements with a data width which can be specified using configuration data, logic for a transpose read of the matrix as-written and logic for normal read of the matrix as-written. The memory array includes first and second read ports operable in parallel. Transpose read logic and normal read logic can be coupled to the first and second read ports, respectively, allowing transpose and normal read of a matrix simultaneously.
