Compile time logic for detecting streaming compatible and broadcast compatible data access patterns

    公开(公告)号:US11237971B1

    公开(公告)日:2022-02-01

    申请号:US17023015

    申请日:2020-09-16

    Abstract: A dataflow graph for an application has operation units that are configured to be producers and consumers of tensors. A write access pattern of a particular producer specifies an order in which the particular producer generates elements of a tensor, and a read access pattern of a corresponding consumer specifies an order in which the corresponding consumer processes the elements of the tensor. The technology disclosed detects conflicts between the producers and the corresponding consumers that have mismatches between the write access patterns and the read access patterns. A conflict occurs when the order in which the particular producer generates the elements of the tensor is different from the order in which the corresponding consumer processes the elements of the tensor. The technology disclosed resolves the conflicts by inserting buffers between the producers and the corresponding consumers.

    Compiler flow logic for reconfigurable architectures

    公开(公告)号:US11714780B2

    公开(公告)日:2023-08-01

    申请号:US17326128

    申请日:2021-05-20

    CPC classification number: G06F15/7871 G06F12/023 G06F16/9024

    Abstract: The technology disclosed partitions a dataflow graph of a high-level program into memory allocations and execution fragments. The memory allocations represent creation of logical memory spaces in on-processor and/or off-processor memories for data required to implement the dataflow graph. The execution fragments represent operations on the data. The technology disclosed designates the memory allocations to virtual memory units and the execution fragments to virtual compute units. The technology disclosed partitions the execution fragments into memory fragments and compute fragments, and assigns the memory fragments to the virtual memory units and the compute fragments to the virtual compute units. The technology disclosed then allocates the virtual memory units to physical memory units and the virtual compute units to physical compute units. It then places the physical memory units and the physical compute units onto positions in the array of configurable units and routes data and control networks between the placed positions.

    Compiler flow logic for reconfigurable architectures

    公开(公告)号:US11080227B2

    公开(公告)日:2021-08-03

    申请号:US16536192

    申请日:2019-08-08

    Abstract: The technology disclosed partitions a dataflow graph of a high-level program into memory allocations and execution fragments. The memory allocations represent creation of logical memory spaces in on-processor and/or off-processor memories for data required to implement the dataflow graph. The execution fragments represent operations on the data. The technology disclosed designates the memory allocations to virtual memory units and the execution fragments to virtual compute units. The technology disclosed partitions the execution fragments into memory fragments and compute fragments, and assigns the memory fragments to the virtual memory units and the compute fragments to the virtual compute units. The technology disclosed then allocates the virtual memory units to physical memory units and the virtual compute units to physical compute units. It then places the physical memory units and the physical compute units onto positions in the array of configurable units and routes data and control networks between the placed positions.

    Buffer splitting
    6.
    发明授权

    公开(公告)号:US12164463B2

    公开(公告)日:2024-12-10

    申请号:US18130667

    申请日:2023-04-04

    Abstract: A method in a reconfigurable computing system includes receiving a user program for execution on a reconfigurable dataflow computing system, comprising a grid of compute units and grid of memory units interconnected with a switching array. The user program includes multiple tensor-based algebraic expressions that are converted to an intermediate representation comprising one or more logical operations executable via dataflow through compute units. These one or more logical operations are preceded by or followed by a buffer, each buffer corresponding to one or more memory units. The method includes determining whether splitting a selected buffer yields a reduced cost and then splitting the selected buffer, in response to the determining step, to produce first and second buffers. Dataflow through memory units corresponding to the first and second buffers is controlled by one or more memory units within the grid of memory units. Buffer splitting optimization reduces memory unit consumption.

    Anti-congestion flow control for reconfigurable processors

    公开(公告)号:US11709664B2

    公开(公告)日:2023-07-25

    申请号:US16890841

    申请日:2020-06-02

    CPC classification number: G06F8/452 G06F8/41 G06F15/7867 G06F15/825

    Abstract: A compiler configured to configure memory nodes with a ready-to-read credit counter and a write credit counter. The ready-to-read credit counter of a particular upstream memory node initialized with as many read credits as a buffer depth of a corresponding downstream memory node. The ready-to-read credit counter configured to decrement when a buffer data unit is written by the particular upstream memory node into the corresponding downstream memory node, and to increment when the particular upstream memory node receives from the corresponding downstream memory node a read ready token. The write credit counter of the particular upstream memory node initialized with one or more write credits and configured to decrement when the particular upstream memory node begins writing the buffer data unit into the corresponding downstream memory node, and to increment when the particular upstream memory node receives from the corresponding downstream memory node a write done token.

    Systems and methods for memory layout determination and conflict resolution

    公开(公告)号:US11645057B2

    公开(公告)日:2023-05-09

    申请号:US17031679

    申请日:2020-09-24

    CPC classification number: G06F8/443 G06F8/433 G06F12/0842

    Abstract: A dataflow graph has operation units that are configured to be producer operation units to produce tensors for execution of the application, and to be consumer operation units to consume the tensors for execution of the application. Compile time logic is configured to process the dataflow graph to determine, for the tensors, expected producer memory layouts, expected consumer memory layouts, and current memory layouts. The expected producer memory layouts specify memory layouts required by the producer operation units that produce the tensors. The expected consumer memory layouts specify the memory layouts required by the consumer operation units that consume the tensors. The current memory layouts specify the memory layouts of the tensors. Each of the memory layouts includes a vector dimension and at least one of a vector ordering and a data alignment.

Patent Agency Ranking