Graph Spatial Split
    1.
    Published patent application
    Status: Under examination (published)

    Publication number: US20240168915A1

    Publication date: 2024-05-23

    Application number: US18202059

    Filing date: 2023-05-25

    CPC classification number: G06F15/825 G06F9/3867

    Abstract: A method for reducing latency and increasing throughput in a reconfigurable computing system includes receiving a compute graph for execution on a reconfigurable dataflow processor comprising a grid of compute units and grid of memory units interconnected with a switching array. The compute graph includes a node specifying an operation on a tensor. The node may be split into multiple nodes that each specify the operation on a distinctive portion of the tensor to produce a first modified compute graph. The first modified compute graph may be executed. In addition, the multiple nodes may be within a single meta-pipeline stage and may be processed in parallel. Furthermore, the compute graph may further comprise a separate node for gathering the distinctive portions of the tensor into a complete tensor, to produce a second modified compute graph.
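
The split-and-gather idea in this abstract can be illustrated with a small sketch. This is not the patented method or any vendor API: the names `split_rows`, `apply_op`, and `run_split` are hypothetical, a tensor is modeled as a plain list of rows, and the operation is assumed to be elementwise so that each portion can be processed independently.

```python
from typing import Callable, List

Tensor = List[List[float]]  # illustrative stand-in: a 2-D tensor as a list of rows


def split_rows(tensor: Tensor, parts: int) -> List[Tensor]:
    """Partition the rows into `parts` contiguous, non-overlapping portions."""
    base, extra = divmod(len(tensor), parts)
    chunks, start = [], 0
    for i in range(parts):
        size = base + (1 if i < extra else 0)  # spread any remainder over the first chunks
        chunks.append(tensor[start:start + size])
        start += size
    return chunks


def apply_op(op: Callable[[float], float], tensor: Tensor) -> Tensor:
    """The original node: one elementwise operation over the whole tensor."""
    return [[op(x) for x in row] for row in tensor]


def run_split(op: Callable[[float], float], tensor: Tensor, parts: int) -> Tensor:
    """The modified graph: one node per portion, followed by a gather node."""
    # Each list entry stands in for one split node; the entries are independent,
    # so hardware could process them in parallel (here they run sequentially).
    partials = [apply_op(op, chunk) for chunk in split_rows(tensor, parts)]
    # Gather node: concatenate the portions back into a complete tensor.
    return [row for partial in partials for row in partial]


relu = lambda x: max(0.0, x)
data = [[-1.0, 2.0], [3.0, -4.0], [5.0, 6.0]]
assert run_split(relu, data, 2) == apply_op(relu, data)
```

The final assertion checks the property that makes the split legal in this sketch: for an elementwise operation, processing the portions separately and gathering them gives the same result as the unsplit node.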

    Low Latency Nodes Fusion in a Reconfigurable Data Processor
    2.

    Publication number: US20230385231A1

    Publication date: 2023-11-30

    Application number: US18199572

    Filing date: 2023-05-19

    Inventor: Yun DU Jianding LUO

    CPC classification number: G06F15/7878 G06F8/433

    Abstract: A data processing system includes an array of reconfigurable units and a compiler configured to generate a pipeline of n computational nodes related to a dataflow graph, interleaved between n+1 buffers on the array of reconfigurable units. Each computational node is coupled to perform calculations based on data received from an immediately preceding buffer of the n+1 buffers and store results of the calculations into an immediately following buffer of the n+1 buffers after a latency. The compiler is further configured to remove a buffer of the n+1 buffers from the pipeline based on a comparison of the latencies of the computational nodes. A corresponding method is also disclosed herein.
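
A rough sketch of latency-based buffer removal follows. The abstract only says that a buffer is removed "based on a comparison of the latencies"; the specific rule used here (merge adjacent nodes while their combined latency does not exceed the slowest single node) is an assumption made for illustration, and `removable_buffers` is a hypothetical name.

```python
from typing import List


def removable_buffers(latencies: List[float]) -> List[int]:
    """Return the indices of interior buffers that could be removed.

    Nodes are numbered 0..n-1 and buffers 0..n, so buffer i (1 <= i <= n-1)
    sits between node i-1 and node i. Assumed rule, not taken from the patent:
    drop buffer i when the fused stage stays no slower than the bottleneck node.
    """
    if not latencies:
        return []
    bottleneck = max(latencies)  # the slowest node bounds pipeline throughput
    removed = []
    stage = latencies[0]  # latency of the stage being accumulated
    for i in range(1, len(latencies)):
        if stage + latencies[i] <= bottleneck:
            removed.append(i)       # fuse node i into the current stage
            stage += latencies[i]
        else:
            stage = latencies[i]    # keep buffer i and start a new stage
    return removed


# Five nodes, six buffers; the node with latency 10 is the bottleneck.
print(removable_buffers([2, 3, 10, 4, 1]))  # [1, 4]
```

Under this assumed rule the short nodes on either side of the bottleneck are fused, which removes two buffers without lengthening the slowest stage.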

    Bandwidth-Aware Computational Graph Mapping
    3.
    Published patent application

    Publication number: US20230297349A1

    Publication date: 2023-09-21

    Application number: US18121766

    Filing date: 2023-03-15

    CPC classification number: G06F8/433

    Abstract: A computer-implemented method of transforming a high-level program for mapping onto a coarse-grained reconfigurable (CGR) processor with an array of CGR units, including sectioning a dataflow graph into a plurality of sections; extracting performance information for each of the plurality of sections; on a CGR unit: assigning to a section at least two computations dependent on a first data element; scheduling an additional load of the first data element in response to available memory bandwidth for that section; eliminating a buffer between the additional load of the first data element and one of the two computations, for that section; generating configuration data for the and communication channels, wherein the configuration data, when loaded onto an instance of the array of CGR units, causes the array of CGR units to implement the dataflow graph; and storing the configuration data in a non-transitory computer-readable storage medium.
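
The trade described in this abstract, a second load of a shared data element in place of a buffer when memory bandwidth allows, can be sketched as a per-section decision. Everything in the sketch is hypothetical: the `Section` fields, the `plan_section` function, and the simple "spare bandwidth covers one more load" test are assumptions, not details from the application.

```python
from dataclasses import dataclass
from typing import Dict, Union


@dataclass
class Section:
    name: str
    bandwidth_budget: float  # assumed: memory bandwidth available to this section
    bandwidth_used: float    # assumed: bandwidth already consumed by its loads
    load_cost: float         # assumed: bandwidth needed to load the shared element once more


def plan_section(section: Section) -> Dict[str, Union[int, bool]]:
    """Decide how two computations that share one data element receive it."""
    spare = section.bandwidth_budget - section.bandwidth_used
    if spare >= section.load_cost:
        # Enough bandwidth: load the element twice, once per computation,
        # so no buffer is needed to forward it to the second computation.
        return {"loads": 2, "buffer": False}
    # Bandwidth-bound: load once and keep the buffer that forwards the element.
    return {"loads": 1, "buffer": True}


roomy = Section("s0", bandwidth_budget=100.0, bandwidth_used=60.0, load_cost=30.0)
tight = Section("s1", bandwidth_budget=100.0, bandwidth_used=90.0, load_cost=30.0)
print(plan_section(roomy))  # {'loads': 2, 'buffer': False}
print(plan_section(tight))  # {'loads': 1, 'buffer': True}
```

The decision is made per section because, per the abstract, performance information is extracted for each section of the dataflow graph separately.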
