METHOD AND APPARATUS WITH DISTRIBUTED TRAINING OF NEURAL NETWORK

    Publication number: US20230169333A1

    Publication date: 2023-06-01

    Application number: US17862881

    Application date: 2022-07-12

    CPC classification number: G06N3/08

    Abstract: Disclosed are a training method and a training apparatus for distributed training of a neural network. The training apparatus includes processors configured to perform distributed training, wherein each of the processors is further configured to: perform a forward direction operation for layers of the neural network; determine a loss of the neural network based on the forward direction operation; determine a local gradient for each layer of the neural network by performing a backward direction operation for the layers based on the loss; in response to determining a local gradient for a current layer through the backward direction operation, determine whether to perform gradient clipping for a local gradient determined for a previous layer; determine an aggregated gradient based on the backward direction operation and the gradient clipping performed by each of the processors; and update parameters of the neural network based on the aggregated gradient.
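
    The flow described in the abstract can be illustrated with a small single-process simulation. This is only a sketch under stated assumptions, not the patented implementation: the abstract does not specify the clipping criterion, the aggregation rule, or the optimizer, so the code assumes L2-norm clipping against a fixed threshold, plain averaging across processors, and an SGD update. It also assumes that "previous layer" means the layer handled immediately before the current one during the backward pass. All function names, and the `max_norm` and `lr` parameters, are hypothetical.

```python
import math


def l2_norm(vec):
    """Euclidean norm of a flat gradient vector."""
    return math.sqrt(sum(x * x for x in vec))


def clip_by_norm(vec, max_norm):
    """Assumed criterion: rescale the gradient only if its L2 norm exceeds max_norm."""
    norm = l2_norm(vec)
    if norm <= max_norm:
        return list(vec)
    scale = max_norm / norm
    return [x * scale for x in vec]


def backward_with_deferred_clipping(raw_grads, max_norm):
    """Simulate one processor's backward pass.

    raw_grads holds per-layer gradients in forward order (layer 0 first) and
    stands in for real backpropagation. Layers are visited last to first; once
    the gradient of the current layer is available, the layer handled just
    before it is considered for clipping.
    """
    num_layers = len(raw_grads)
    local = [None] * num_layers
    prev = None
    for layer in reversed(range(num_layers)):
        local[layer] = list(raw_grads[layer])  # "determine local gradient"
        if prev is not None:
            local[prev] = clip_by_norm(local[prev], max_norm)
        prev = layer
    # Assumption: the final layer visited is clipped once the pass ends.
    if prev is not None:
        local[prev] = clip_by_norm(local[prev], max_norm)
    return local


def aggregate(per_worker_grads):
    """Assumed aggregation: element-wise mean over processors, layer by layer."""
    num_workers = len(per_worker_grads)
    num_layers = len(per_worker_grads[0])
    return [
        [sum(vals) / num_workers
         for vals in zip(*(worker[layer] for worker in per_worker_grads))]
        for layer in range(num_layers)
    ]


def sgd_update(params, grads, lr):
    """Assumed optimizer: plain SGD applied with the aggregated gradient."""
    return [[p - lr * g for p, g in zip(p_layer, g_layer)]
            for p_layer, g_layer in zip(params, grads)]


# Usage: two simulated processors, two layers with two parameters each.
worker_raw_grads = [
    [[3.0, 4.0], [0.3, 0.4]],
    [[0.0, 1.0], [6.0, 8.0]],
]
local_grads = [backward_with_deferred_clipping(g, max_norm=1.0)
               for g in worker_raw_grads]
aggregated = aggregate(local_grads)
params = sgd_update([[1.0, 1.0], [1.0, 1.0]], aggregated, lr=0.1)
```

    Deferring the clipping decision for one layer until the next layer's gradient has been computed is what would allow the two steps to overlap on real hardware; the sequential loop here only mirrors the ordering, not any actual concurrency.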
