Distributed processing system and distributed processing method

    Publication Number: US11823063B2

    Publication Date: 2023-11-21

    Application Number: US16973707

    Application Date: 2019-05-21

    Abstract: Individual distributed processing nodes packetize distributed data for each weight of a neural network to be learned, in the order of the weight numbers, transmit the distributed data to an aggregation processing node, acquire, in order, the aggregation data transmitted from the aggregation processing node, and update the weights of the neural network. The aggregation processing node acquires the transmitted distributed data, packetizes, for each weight, the aggregation data into which the distributed data of all the distributed processing nodes is aggregated, and transmits the aggregation data to the individual nodes. The individual nodes monitor an unreceived data amount, which is the difference between the amount of transmitted distributed data and the amount of acquired aggregation data; when the unreceived data amount becomes equal to or larger than a threshold Ma, a node stops transmission of the distributed data until the unreceived data amount becomes equal to or smaller than a threshold Mb (Mb < Ma).
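
    Below is a minimal Python sketch of the transmit-side flow control this abstract describes. The class name, counters, and packet granularity are illustrative assumptions; the abstract only fixes the behavior (stop at Ma, resume at Mb).

        # Hysteresis flow control: stop sending when the backlog of distributed
        # data not yet returned as aggregation data reaches Ma, resume at Mb.
        class FlowControlledSender:
            def __init__(self, ma: int, mb: int):
                assert mb < ma, "resume threshold must be below stop threshold"
                self.ma = ma        # stop threshold (illustrative name)
                self.mb = mb        # resume threshold (illustrative name)
                self.sent = 0       # distributed-data units transmitted
                self.received = 0   # aggregation-data units received back
                self.paused = False

            @property
            def unreceived(self) -> int:
                # Difference between transmitted and acquired data amounts.
                return self.sent - self.received

            def can_send(self) -> bool:
                if self.paused and self.unreceived <= self.mb:
                    self.paused = False        # backlog drained; resume
                elif not self.paused and self.unreceived >= self.ma:
                    self.paused = True         # backlog too large; stop
                return not self.paused

            def on_send(self) -> None:
                self.sent += 1

            def on_receive(self) -> None:
                self.received += 1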

    Computer System and Arithmetic Processing Method

    Publication Number: US20230273835A1

    Publication Date: 2023-08-31

    Application Number: US18006934

    Application Date: 2020-08-05

    IPC Classification: G06F9/50

    CPC Classification: G06F9/5072 G06F9/505

    Abstract: A computer system according to the present invention includes N (N is an integer of 2 or more) data output devices, a transmission control device, and an arithmetic device. The arithmetic device executes predetermined arithmetic processing on data collected from the N data output devices via a communication network connecting the data output devices and the arithmetic device to each other. The transmission control device controls the transmission timing of the data output from the N data output devices according to the processing content of the predetermined arithmetic processing executed by the arithmetic device, and the N data output devices output the data on the basis of the transmission timing notified by the transmission control device.
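
    As a rough illustration of the timing control, the sketch below staggers the N devices' send times by a fixed slot; the slot width and the device-index ordering are assumptions, since the abstract only states that timing is derived from the processing content.

        # Assign each data output device a transmit time one slot apart, so
        # packets reach the arithmetic device collision-free and in order.
        def schedule_transmissions(n_devices: int, slot_s: float, start_s: float = 0.0):
            return {dev: start_s + dev * slot_s for dev in range(n_devices)}

        # Example: 4 devices, 2 ms apart.
        for dev, t in sorted(schedule_transmissions(4, slot_s=0.002).items()):
            print(f"device {dev}: transmit at t = {t * 1000:.1f} ms")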

    Distributed Deep Learning System
    Invention Application

    Publication Number: US20220321641A1

    Publication Date: 2022-10-06

    Application Number: US17627346

    Application Date: 2019-07-16

    Abstract: A distributed deep learning system according to an embodiment includes M distributed processing nodes that perform deep learning of a neural network in a distributed manner, and N aggregation processing nodes that are connected to each of the M distributed processing nodes via a first communication line and a second communication line and aggregate, via the first communication line, the distributed processing results obtained at the M distributed processing nodes. Accordingly, even when a plurality of users share the distributed deep learning system at the same time, efficient and stable distributed deep learning processing can be realized.
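
    A rough sketch of the sharing idea, under the added assumption (not stated in the abstract) that each user's job runs on a disjoint subset of the M distributed processing nodes and one aggregation processing node sums that subset's results:

        # Element-wise sum of distributed processing results, as an
        # aggregation processing node might compute over the first line.
        def aggregate(partials):
            return [sum(vals) for vals in zip(*partials)]

        # Two users share M = 4 distributed nodes; each user's results are
        # aggregated independently, so the jobs do not interfere.
        results_by_node = {0: [1, 2], 1: [3, 4], 2: [5, 6], 3: [7, 8]}
        user_to_nodes = {"user_a": [0, 1], "user_b": [2, 3]}
        for user, nodes in user_to_nodes.items():
            print(user, aggregate([results_by_node[n] for n in nodes]))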

    Distributed Processing System and Distributed Processing Method

    Publication Number: US20220261620A1

    Publication Date: 2022-08-18

    Application Number: US17596070

    Application Date: 2019-06-03

    IPC Classification: G06N3/063 G06N3/08

    Abstract: A distributed processing node transmits distributed data for M groups, as intermediate consolidated data, from its M communication units to the next distributed processing node. Each subsequent distributed processing node generates, for each group, updated intermediate consolidated data from the received intermediate consolidated data and its own distributed data, and transmits the updated data from its M communication units to the next distributed processing node. The last distributed processing node transmits the received intermediate consolidated data, as consolidated data, to another distributed processing node, and each node forwards the received consolidated data in turn. Each of the distributed processing nodes updates the weights of a neural network based on the consolidated data.
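
    A compact sketch of the group-wise consolidation, assuming (illustratively) that each of the M communication units carries exactly one group and that consolidation is element-wise addition:

        # Relay a running per-group sum through the nodes; the final totals
        # are the consolidated data distributed back to every node.
        def consolidate_in_groups(per_node_data, m_groups):
            # per_node_data[node][group] is that node's distributed data.
            consolidated = [list(per_node_data[0][g]) for g in range(m_groups)]
            for node_data in per_node_data[1:]:
                for g in range(m_groups):      # each group on its own unit
                    consolidated[g] = [a + b for a, b in
                                       zip(consolidated[g], node_data[g])]
            return consolidated

        data = [[[1, 1], [2, 2]], [[3, 3], [4, 4]]]     # 2 nodes, M = 2 groups
        print(consolidate_in_groups(data, m_groups=2))  # [[4, 4], [6, 6]]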

    Distributed Processing System and Distributed Processing Method

    Publication Number: US20210209443A1

    Publication Date: 2021-07-08

    Application Number: US16973717

    Application Date: 2019-05-05

    IPC Classification: G06N3/04 G06N3/08

    Abstract: The first distributed processing node sets, as intermediate aggregated data, the distributed data generated by the node itself and transmits this data to the distributed processing node with the next number designated in advance. Each intermediate distributed processing node, excluding the first and last distributed processing nodes, calculates, for each of the weights assigned to it, the sum of the received intermediate aggregated data and the distributed data generated by the node itself, generates updated intermediate aggregated data, and transmits this data to the distributed processing node with the next number designated in advance. The last distributed processing node calculates, for each of the weights assigned to it, the sum of the received intermediate aggregated data and the distributed data generated by the node itself, generates aggregated data, and transmits this data to the first and intermediate distributed processing nodes. The distributed processing nodes update the weights of a neural network based on this data.
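
    The relay reads like the aggregation phase of a ring all-reduce. A minimal sketch, with aggregation assumed to be element-wise summation of per-weight gradients:

        # Node 0 seeds the running sum; each next-numbered node adds its own
        # data; the last node's result is broadcast back so all nodes update
        # their weights identically.
        def relay_aggregate(distributed):
            running = list(distributed[0])           # first node's data
            for node_data in distributed[1:]:        # next-numbered nodes
                running = [s + d for s, d in zip(running, node_data)]
            return running

        grads = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]  # 3 nodes, 2 weights
        aggregated = relay_aggregate(grads)
        lr = 0.01                                     # illustrative step size
        weights = [w - lr * g for w, g in zip([1.0, 1.0], aggregated)]
        print(aggregated, weights)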

    Distributed Deep Learning System
    Invention Application

    Publication Number: US20210034978A1

    Publication Date: 2021-02-04

    Application Number: US16967702

    Application Date: 2019-02-06

    IPC Classification: G06N3/08 G06N3/04 G06N3/063

    Abstract: Each of the learning nodes calculates gradients of a loss function from an output result obtained by inputting learning data to a learning-target neural network, converts the calculation result into a packet, and transmits the packet to a computing interconnect device. The computing interconnect device receives the packets transmitted from the learning nodes, acquires the gradient values stored in the packets, calculates the sum of the gradients, converts the calculation result into a packet, and transmits the packet to each of the learning nodes. Each of the learning nodes receives the packet transmitted from the computing interconnect device and updates the constituent parameters of its neural network based on the values stored in the packet.
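
    A toy sketch of the packet round-trip; the struct-based packet layout is an invented stand-in for whatever packet format the patent actually uses:

        import struct

        def pack(grads):                      # gradients -> packet payload
            return struct.pack(f"{len(grads)}f", *grads)

        def unpack(packet, n):                # packet payload -> gradients
            return list(struct.unpack(f"{n}f", packet))

        # The computing interconnect device: sum the gradients arriving from
        # all nodes and return the sum as a packet to every node.
        def interconnect_sum(packets, n):
            grads = [unpack(p, n) for p in packets]
            return pack([sum(g) for g in zip(*grads)])

        node_grads = [[0.1, -0.2], [0.3, 0.1]]        # 2 nodes, 2 parameters
        reply = interconnect_sum([pack(g) for g in node_grads], n=2)
        print(unpack(reply, 2))                       # approximately [0.4, -0.1]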

    Scheduling apparatus, scheduling method and program

    Publication Number: US12035295B2

    Publication Date: 2024-07-09

    Application Number: US17612270

    Application Date: 2019-05-31

    IPC Classification: H04W72/12 H04L5/00

    CPC Classification: H04W72/12 H04L5/0035

    Abstract: A scheduling apparatus includes: a division control device configured to divide the entire communicable area into a plurality of areas; combination generation devices (12-1 to 12-N) configured to generate candidate patterns of combinations of transmission points and user terminals for each area; combination evaluation devices (13-1 to 13-N) configured to calculate evaluation values of the candidate patterns for each area; optimal combination holding devices (15-1 to 15-N) configured to hold the optimal combination pattern among the candidate patterns for each area; a calculation result sharing device configured to output the evaluation value of each optimal combination pattern to the combination evaluation devices (13-1 to 13-N) so that it is shared among the areas as shared information; and an overall transmission weight matrix calculation device configured to calculate a transmission weight matrix for the entire communicable area based on a result obtained by combining the optimal combination patterns of the areas.
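
    A simplified sketch of the per-area pipeline; exhaustive one-to-one candidate generation and an additive link-gain score are stand-ins for the patent's evaluation, and cross-area sharing is reduced to collecting each area's best result:

        from itertools import permutations

        def candidate_patterns(points, users):
            # All one-to-one pairings of transmission points and terminals.
            return [list(zip(points, p)) for p in permutations(users)]

        def evaluate(pattern, gain):
            return sum(gain[tp, ue] for tp, ue in pattern)

        def best_per_area(areas, gain):
            shared = {}                       # stands in for result sharing
            for name, (points, users) in areas.items():
                shared[name] = max((evaluate(p, gain), p)
                                   for p in candidate_patterns(points, users))
            return shared

        areas = {"area1": (["tp1", "tp2"], ["ue1", "ue2"])}
        gain = {("tp1", "ue1"): 3, ("tp1", "ue2"): 1,
                ("tp2", "ue1"): 2, ("tp2", "ue2"): 4}
        print(best_per_area(areas, gain))     # best: tp1->ue1, tp2->ue2 (7)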

    Distributed Deep Learning System
    Invention Application

    Publication Number: US20230004787A1

    Publication Date: 2023-01-05

    Application Number: US17779736

    Application Date: 2019-11-27

    IPC Classification: G06N3/063 G06N3/08

    Abstract: A distributed deep learning system includes nodes 1-n (n = 1, ..., 4) and a network. Each node 1-n includes GPUs 11-n-1 and 11-n-2 and an FPGA 12-n. The FPGA 12-n includes a plurality of GPU reception buffers, a plurality of network transmission buffers that store data transferred from the GPU reception buffers, a plurality of network reception buffers that store aggregated data received from other nodes, and a plurality of GPU transmission buffers that store data transferred from the network reception buffers. The GPUs 11-n-1 and 11-n-2 DMA-transfer data to the FPGA 12-n, and the data stored in the GPU transmission buffers is DMA-transferred back to the GPUs 11-n-1 and 11-n-2.
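
    A sketch of the four buffer stages inside each FPGA, modeled as plain queues; actual DMA transfers, buffer counts, and sizing are abstracted away:

        from collections import deque

        class FpgaBuffers:
            def __init__(self):
                self.gpu_rx = deque()   # filled by GPU DMA transfers
                self.net_tx = deque()   # drained onto the network
                self.net_rx = deque()   # aggregated data from other nodes
                self.gpu_tx = deque()   # DMA-transferred back to the GPUs

            def gpu_dma_in(self, data):
                self.gpu_rx.append(data)

            def stage_for_network(self):
                self.net_tx.append(self.gpu_rx.popleft())

            def network_in(self, aggregated):
                self.net_rx.append(aggregated)

            def stage_for_gpu(self):
                self.gpu_tx.append(self.net_rx.popleft())

            def gpu_dma_out(self):
                return self.gpu_tx.popleft()

        fpga = FpgaBuffers()
        fpga.gpu_dma_in("grads-0")        # GPU -> GPU reception buffer
        fpga.stage_for_network()          # -> network transmission buffer
        fpga.network_in("aggregated-0")   # other nodes -> network reception
        fpga.stage_for_gpu()              # -> GPU transmission buffer
        print(fpga.gpu_dma_out())         # back to a GPU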