Publication No.: US20230169333A1
Publication date: 2023-06-01
Application No.: US17862881
Filing date: 2022-07-12
Applicant: SAMSUNG ELECTRONICS CO., LTD.
Inventor: CHANGIN CHOI, Gunhee KIM, YONGDEOK KIM, MYEONG WOO KIM, SEUNGWON LEE, NARANKHUU TUVSHINJARGAL
IPC: G06N3/08
CPC classification number: G06N3/08
Abstract: Disclosed are a training method and apparatus for distributed training of a neural network, the training apparatus including processors configured to perform distributed training, wherein each of the processors is further configured to perform a forward direction operation for layers of the neural network, determine a loss of the neural network based on the forward direction operation, determine a local gradient for each layer of the neural network by performing a backward direction operation for the layers of the neural network based on the loss, determine whether to perform gradient clipping for a local gradient determined for a previous layer, in response to determining a local gradient for a current layer through the backward direction operation, determine an aggregated gradient based on the backward direction operation and the gradient clipping performed by each of the processors, and update parameters of the neural network based on the aggregated gradient.
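The flow described in the abstract can be illustrated with a minimal single-process sketch. It is not the claimed implementation. Everything named here is an assumption made for illustration: the toy network is a chain of scalar weights, the loss is 0.5 * (y - target)^2, clipping is a per-layer magnitude threshold (`max_norm`), each "processor" is simulated by one `(x, target)` pair, and the aggregated gradient is taken to be the plain average of the local gradients. The abstract does not specify any of these choices, nor how the last layer processed in the backward pass is handled; the sketch simply clips it after the loop.

```python
import math
from typing import List, Tuple


def clip(grad: float, max_norm: float) -> float:
    """Clip a scalar layer gradient to magnitude max_norm (assumed clipping rule)."""
    if abs(grad) <= max_norm:
        return grad
    return math.copysign(max_norm, grad)


def local_gradients(weights: List[float], x: float, target: float,
                    max_norm: float) -> List[float]:
    """One simulated processor: forward pass, loss, backward pass with deferred clipping."""
    # Forward direction operation: acts[i] is the input to layer i.
    acts = [x]
    for w in weights:
        acts.append(w * acts[-1])

    # Loss = 0.5 * (y - target)^2, so dLoss/dy = y - target.
    delta = acts[-1] - target

    grads = [0.0] * len(weights)
    pending = None  # layer whose gradient was determined in the previous step
    for i in reversed(range(len(weights))):
        # Determine the local gradient for the current layer.
        grads[i] = delta * acts[i]
        delta *= weights[i]
        # In response, decide whether to clip the previous layer's gradient.
        if pending is not None:
            grads[pending] = clip(grads[pending], max_norm)
        pending = i
    # Assumption: the last layer processed is clipped once the loop ends.
    if pending is not None:
        grads[pending] = clip(grads[pending], max_norm)
    return grads


def train_step(weights: List[float], shards: List[Tuple[float, float]],
               max_norm: float, lr: float) -> List[float]:
    """Aggregate the local gradients (assumed: plain average) and update the parameters."""
    per_worker = [local_gradients(weights, x, t, max_norm) for x, t in shards]
    n = len(per_worker)
    aggregated = [sum(g[i] for g in per_worker) / n for i in range(len(weights))]
    return [w - lr * g for w, g in zip(weights, aggregated)]
```

For example, `train_step([2.0, 3.0], [(1.0, 0.0), (-1.0, 0.0)], max_norm=10.0, lr=0.01)` simulates two processors sharing the same parameters. In a real distributed setting, deferring the clipping decision for one layer until the next layer's gradient is available would let clipping overlap with the remaining backward computation; this single-process sketch only shows the ordering.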