Patent search ap:("SambaNova Systems Page Inc.") AND inv:"Yun DU"

1.

Invention publication
Graph Spatial Split (Pending - Published)

Publication No.: US20240168915A1

Publication date: 2024-05-23

Application No.: US18202059

Filing date: 2023-05-25

Applicant: SambaNova Systems, Inc.

Inventor: Yun DU, Gao DENG, Jianding LUO, Zhengyu CHEN

IPC: G06F15/82, G06F9/38

CPC classification number: G06F15/825, G06F9/3867

Abstract: A method for reducing latency and increasing throughput in a reconfigurable computing system includes receiving a compute graph for execution on a reconfigurable dataflow processor comprising a grid of compute units and grid of memory units interconnected with a switching array. The compute graph includes a node specifying an operation on a tensor. The node may be split into multiple nodes that each specify the operation on a distinctive portion of the tensor to produce a first modified compute graph. The first modified compute graph may be executed. In addition, the multiple nodes may be within a single meta-pipeline stage and may be processed in parallel. Furthermore, the compute graph may further comprise a separate node for gathering the distinctive portions of the tensor into a complete tensor, to produce a second modified compute graph.
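The node split described in this abstract can be illustrated with a minimal Python sketch. It is not the patented method or any SambaNova API: the `Node` class, the `spatial_split` function, and the choice to split along a row range are all hypothetical, picked only to show one node becoming several nodes over distinct tensor portions plus a separate gather node.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """Hypothetical compute-graph node (not a SambaNova data structure)."""
    name: str
    op: str
    rows: tuple  # (start, stop) row range of the tensor this node covers


def spatial_split(node, parts):
    """Split `node` into `parts` nodes over disjoint row ranges.

    Returns the split nodes plus a gather node that covers the full
    range, mirroring the "separate node for gathering" in the abstract.
    Assumption: the tensor is partitioned along its rows as evenly as
    possible; the abstract does not say how portions are chosen.
    """
    start, stop = node.rows
    base, extra = divmod(stop - start, parts)
    pieces = []
    cursor = start
    for i in range(parts):
        size = base + (1 if i < extra else 0)  # spread the remainder
        pieces.append(Node(f"{node.name}_part{i}", node.op, (cursor, cursor + size)))
        cursor += size
    gather = Node(f"{node.name}_gather", "gather", (start, stop))
    return pieces, gather


pieces, gather = spatial_split(Node("relu", "relu", (0, 10)), parts=3)
print([p.rows for p in pieces])  # [(0, 4), (4, 7), (7, 10)]
```

Because the split nodes cover disjoint portions and apply the same operation, they have no dependencies on one another, which is what would allow them to run in parallel within one meta-pipeline stage.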

2.

Invention publication
Low Latency Nodes Fusion in a Reconfigurable Data Processor (Pending - Published)

Publication No.: US20230385231A1

Publication date: 2023-11-30

Application No.: US18199572

Filing date: 2023-05-19

Applicant: SambaNova Systems, Inc.

Inventor: Yun DU, Jianding LUO

IPC: G06F15/78, G06F8/41

CPC classification number: G06F15/7878, G06F8/433

Abstract: A data processing system includes an array of reconfigurable units and a compiler configured to generate a pipeline of n computational nodes related to a dataflow graph, interleaved between n+1 buffers on the array of reconfigurable units. Each computational node is coupled to perform calculations based on data received from an immediately preceding buffer of the n+1 buffers and store results of the calculations into an immediately following buffer of the n+1 buffers after a latency. The compiler is further configured to remove a buffer of the n+1 buffers from the pipeline based on a comparison of the latencies of the computational nodes. A corresponding method is also disclosed herein.
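The abstract says a buffer is removed "based on a comparison of the latencies" but does not state the comparison itself. The Python sketch below therefore uses an assumed rule, chosen only for illustration: adjacent nodes are fused (and the buffer between them removed) when their combined latency does not exceed that of the slowest node in the pipeline. The function name `fuse_low_latency_nodes` and the buffer numbering are likewise hypothetical.

```python
def fuse_low_latency_nodes(latencies):
    """Greedily fuse adjacent pipeline nodes and report removed buffers.

    `latencies[i]` is the latency of node i. With n nodes there are n+1
    buffers; buffer k (1 <= k <= n-1) sits between node k-1 and node k.

    Assumed rule (not taken from the patent): a node joins the current
    stage when the stage's total latency stays at or below the latency
    of the slowest single node, so fusion never creates a new bottleneck.
    """
    if not latencies:
        return [], []
    bottleneck = max(latencies)
    stages = [[0]]
    stage_latency = latencies[0]
    removed_buffers = []
    for i in range(1, len(latencies)):
        if stage_latency + latencies[i] <= bottleneck:
            stages[-1].append(i)       # fuse node i into the current stage
            stage_latency += latencies[i]
            removed_buffers.append(i)  # buffer between node i-1 and node i
        else:
            stages.append([i])         # keep the buffer, start a new stage
            stage_latency = latencies[i]
    return stages, removed_buffers


stages, removed = fuse_low_latency_nodes([2, 3, 10, 4, 5])
print(stages)   # [[0, 1], [2], [3, 4]]
print(removed)  # [1, 4]
```

In this example two of the six buffers are removed while the slowest stage still has latency 10, so under the assumed rule end-to-end latency drops without lowering pipeline throughput.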

3.

Invention publication
Bandwidth-Aware Computational Graph Mapping (Pending - Published)

Publication No.: US20230297349A1

Publication date: 2023-09-21

Application No.: US18121766

Filing date: 2023-03-15

Applicant: SambaNova Systems, Inc.

Inventor: Gao DENG, Weihang FAN, Fei WANG, Yun DU

IPC: G06F8/41

CPC classification number: G06F8/433

Abstract: A computer-implemented method of transforming a high-level program for mapping onto a coarse-grained reconfigurable (CGR) processor with an array of CGR units, including sectioning a dataflow graph into a plurality of sections; extracting performance information for each of the plurality of sections; on a CGR unit: assigning to a section at least two computations dependent on a first data element; scheduling an additional load of the first data element in response to available memory bandwidth for that section; eliminating a buffer between the additional load of the first data element and one of the two computations, for that section; generating configuration data for the and communication channels, wherein the configuration data, when loaded onto an instance of the array of CGR units, causes the array of CGR units to implement the dataflow graph; and storing the configuration data in a non-transitory computer-readable storage medium.
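The trade-off in this abstract (spend spare memory bandwidth on an additional load so that a buffer can be eliminated) can be sketched in Python as follows. This is an illustrative model only: the function `plan_loads`, its parameters, the bandwidth units, and the three labels are hypothetical and do not come from the patent or from any SambaNova tool.

```python
def plan_loads(consumers, load_cost, spare_bandwidth):
    """Decide how each computation in one section obtains a shared data element.

    consumers:       names of computations that depend on the same data element
    load_cost:       bandwidth one extra load of the element consumes (assumed unit)
    spare_bandwidth: bandwidth left in the section after the first load

    Returns a dict mapping each consumer to one of:
      "first_load"      - the load that is needed in any case
      "additional_load" - a second load is scheduled; no buffer is needed
      "buffered"        - not enough bandwidth; data is forwarded via a buffer
    """
    plan = {}
    spare = spare_bandwidth
    for index, name in enumerate(consumers):
        if index == 0:
            plan[name] = "first_load"
        elif spare >= load_cost:
            plan[name] = "additional_load"  # buffer to this consumer is eliminated
            spare -= load_cost
        else:
            plan[name] = "buffered"
    return plan


print(plan_loads(["matmul", "add", "mul"], load_cost=4, spare_bandwidth=5))
# {'matmul': 'first_load', 'add': 'additional_load', 'mul': 'buffered'}
```

Here the section has enough spare bandwidth for one additional load, so `add` reads the element directly and its buffer is dropped, while `mul` still receives the element through a buffer.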
