METHOD AND APPARATUS FOR GNN-ACCELERATION FOR EFFICIENT PARALLEL PROCESSING OF MASSIVE DATASETS

    Publication number: US20230418673A1

    Publication date: 2023-12-28

    Application number: US18166685

    Filing date: 2023-02-09

    CPC classification number: G06F9/5027 G06F9/54 G06F9/4881

    Abstract: Provided is an apparatus for accelerating a graph neural network for efficient parallel processing of massive graph datasets, including a streaming multiprocessor (SM) scheduler and a computation unit. The SM scheduler obtains a subgraph and an embedding table per layer, determines the number of SMs to be allocated for processing the embeddings of a destination-vertex based on a feature dimension and the maximum number of threads in each SM, and allocates the determined number of SMs to each destination-vertex included in the subgraph. The computation unit obtains, by each SM, the embeddings of the destination-vertex allocated to that SM, obtains, by each SM, the embeddings of one or more neighbor-vertices of the destination-vertex using the subgraph, and performs, by each SM, a user-designated operation using the embeddings of the destination-vertex and the embeddings of the neighbor-vertices.
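The allocation rule in this abstract can be illustrated with a minimal Python sketch. The abstract only says that the SM count depends on the feature dimension and the per-SM thread limit; the ceiling division below (one thread per feature element) is an assumption, as are the function names, the dictionary-based subgraph, and the logical SM slot numbering.

```python
import math

def sms_per_vertex(feature_dim, max_threads_per_sm):
    # Assumption: one thread handles one feature element, so the SM count
    # is the feature dimension divided by the thread limit, rounded up.
    return math.ceil(feature_dim / max_threads_per_sm)

def allocate_sms(dst_vertices, feature_dim, max_threads_per_sm):
    # Give every destination vertex the same number of logical SM slots.
    # Slot ids are hypothetical; real hardware would map them to physical SMs.
    n = sms_per_vertex(feature_dim, max_threads_per_sm)
    allocation = {}
    next_sm = 0
    for v in dst_vertices:
        allocation[v] = list(range(next_sm, next_sm + n))
        next_sm += n
    return allocation

def aggregate(subgraph, embeddings, dst, op):
    # subgraph maps a destination vertex to its neighbor vertices.
    # op is the user-designated element-wise operation (for example, addition).
    result = list(embeddings[dst])
    for nbr in subgraph[dst]:
        result = [op(a, b) for a, b in zip(result, embeddings[nbr])]
    return result
```

With a 2048-element feature vector and 1024 threads per SM, this sketch assigns two SM slots to each destination vertex; passing `lambda a, b: a + b` as `op` yields a sum aggregation over the vertex and its neighbors.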

    METHOD AND APPARATUS FOR ACCELERATING GNN PRE-PROCESSING

    Publication number: US20240303122A1

    Publication date: 2024-09-12

    Application number: US18453702

    Filing date: 2023-08-22

    CPC classification number: G06F9/5027 G06F7/36

    Abstract: Provided is an apparatus for accelerating graph neural network (GNN) pre-processing, the apparatus including: a set-partitioning accelerator configured to sort the edges of an original graph stored in a coordinate list (COO) format by node number, perform radix sorting based on a vertex identifier (VID) to generate COO arrays of a preset length, and perform uniform random sampling on some nodes of a given node array; a merger configured to merge the COO arrays of the preset length into one sorted COO array; a re-indexer configured to assign new consecutive VIDs to the nodes selected through the uniform random sampling; and a compressed sparse row (CSR) converter configured to convert the edges sorted by node number into a CSR format.
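The pre-processing stages named in this abstract can be sketched in software as follows. This is only an illustration under stated assumptions: Python's built-in sort stands in for the hardware radix sort, edges are `(source, destination)` tuples keyed by source VID, and all function names and the `seed` parameter are hypothetical.

```python
import heapq
import random

def sort_chunks(edges, chunk_len):
    # Split the COO edge list into chunks of a preset length and sort each.
    # The built-in sort is a stand-in for the radix sort on VIDs.
    return [sorted(edges[i:i + chunk_len])
            for i in range(0, len(edges), chunk_len)]

def merge_chunks(chunks):
    # Merge the sorted chunks into one sorted COO array.
    return list(heapq.merge(*chunks))

def sample_nodes(nodes, k, seed=0):
    # Uniform random sampling without replacement from a node array.
    return random.Random(seed).sample(nodes, k)

def reindex(sampled):
    # Assign new consecutive VIDs (0, 1, 2, ...) to the sampled nodes.
    return {old: new for new, old in enumerate(sampled)}

def coo_to_csr(sorted_edges, num_nodes):
    # Convert source-sorted COO edges into CSR row pointers and column indices.
    row_ptr = [0] * (num_nodes + 1)
    for src, _ in sorted_edges:
        row_ptr[src + 1] += 1
    for i in range(num_nodes):
        row_ptr[i + 1] += row_ptr[i]
    col_idx = [dst for _, dst in sorted_edges]
    return row_ptr, col_idx
```

For the edge list `[(2, 0), (0, 1), (1, 2), (0, 2)]` with a chunk length of 2, sorting and merging gives `[(0, 1), (0, 2), (1, 2), (2, 0)]`, and the CSR conversion over 3 nodes gives row pointers `[0, 2, 3, 4]` and column indices `[1, 2, 2, 0]`.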

    HYBRID MEMORY SYSTEM AND ACCELERATOR INCLUDING THE SAME

    Publication number: US20240045588A1

    Publication date: 2024-02-08

    Application number: US18090645

    Filing date: 2022-12-29

    CPC classification number: G06F3/0604 G06F3/0647 G06F3/0679

    Abstract: An accelerator includes a processor and a hybrid memory system. The hybrid memory system includes a resistance-based non-volatile memory, a DRAM used as a cache of the resistance-based non-volatile memory, a non-volatile memory controller connected to the resistance-based non-volatile memory and configured to control the DRAM and the resistance-based non-volatile memory, a memory controller configured to process a memory request from the processor and control the DRAM, and a memory channel configured to connect the DRAM, the non-volatile memory controller, and the memory controller.
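The idea of DRAM acting as a cache in front of a resistance-based non-volatile memory can be modeled with a small Python sketch. The abstract does not specify a cache organization or write policy, so the direct-mapped layout, the write-back policy, and the class and attribute names below are all assumptions chosen for brevity.

```python
class HybridMemory:
    """Toy model: a direct-mapped, write-back DRAM cache over an NVM array.

    The organization and policy are assumptions, not taken from the patent.
    """

    def __init__(self, nvm_size, dram_lines):
        self.nvm = [0] * nvm_size          # backing non-volatile memory
        self.dram_lines = dram_lines
        self.tags = [None] * dram_lines    # NVM address cached in each line
        self.data = [0] * dram_lines
        self.dirty = [False] * dram_lines
        self.hits = 0
        self.misses = 0

    def _fill(self, addr):
        # Return the DRAM line holding addr, loading it from NVM on a miss.
        idx = addr % self.dram_lines
        if self.tags[idx] == addr:
            self.hits += 1
            return idx
        self.misses += 1
        if self.tags[idx] is not None and self.dirty[idx]:
            # Write the evicted dirty line back to NVM.
            self.nvm[self.tags[idx]] = self.data[idx]
        self.tags[idx] = addr
        self.data[idx] = self.nvm[addr]
        self.dirty[idx] = False
        return idx

    def read(self, addr):
        return self.data[self._fill(addr)]

    def write(self, addr, value):
        idx = self._fill(addr)
        self.data[idx] = value
        self.dirty[idx] = True
```

In this model a written value stays in DRAM until a conflicting address evicts its line, at which point it is written back to the NVM array.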
