-
Publication No.: US11386327B2
Publication Date: 2022-07-12
Application No.: US15983782
Filing Date: 2018-05-18
Applicant: salesforce.com, inc.
Inventor: Huishuai Zhang, Caiming Xiong
Abstract: Embodiments for training a neural network are provided. A neural network is divided into a first block and a second block, and the parameters in the first block and second block are trained in parallel. To train the parameters, a gradient from a gradient mini-batch included in training data is generated. A curvature-vector product from a curvature mini-batch included in the training data is also generated. The gradient and the curvature-vector product generate a conjugate gradient. The conjugate gradient is used to determine a change in parameters in the first block in parallel with a change in parameters in the second block. The curvature matrix in the curvature-vector product includes zero values when the terms correspond to parameters from different blocks.
-
2.
Publication No.: US20180373987A1
Publication Date: 2018-12-27
Application No.: US15983782
Filing Date: 2018-05-18
Applicant: salesforce.com, inc.
Inventor: Huishuai Zhang, Caiming Xiong
Abstract: Embodiments for training a neural network are provided. A neural network is divided into a first block and a second block, and the parameters in the first block and second block are trained in parallel. To train the parameters, a gradient from a gradient mini-batch included in training data is generated. A curvature-vector product from a curvature mini-batch included in the training data is also generated. The gradient and the curvature-vector product generate a conjugate gradient. The conjugate gradient is used to determine a change in parameters in the first block in parallel with a change in parameters in the second block. The curvature matrix in the curvature-vector product includes zero values when the terms correspond to parameters from different blocks.
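Illustrative sketch (not taken from the patent text): the block-wise update described in the abstract can be illustrated on a toy problem. The example below assumes a linear least-squares model rather than a neural network, so the curvature matrix is simply X^T X / n. The function names `conjugate_gradient` and `block_step`, the `damping` term, and the use of a thread pool are all hypothetical choices made for illustration. Restricting the curvature-vector product to the columns of one block is equivalent to setting the cross-block entries of the curvature matrix to zero, which lets each block be solved independently.

```python
import numpy as np
from concurrent.futures import ThreadPoolExecutor


def conjugate_gradient(hvp, g, iters=50, tol=1e-20):
    """Approximately solve H d = -g using only products hvp(v) = H v."""
    d = np.zeros_like(g)
    r = -g.copy()            # residual of H d = -g at d = 0
    p = r.copy()
    rs = r @ r
    for _ in range(iters):
        if rs < tol:         # squared residual norm small enough
            break
        Hp = hvp(p)
        alpha = rs / (p @ Hp)
        d += alpha * p
        r -= alpha * Hp
        rs_new = r @ r
        p = r + (rs_new / rs) * p
        rs = rs_new
    return d


def block_step(w, blocks, grad_batch, curv_batch, damping=1e-3):
    """Return a parameter change computed block by block.

    blocks     : list of index arrays, one per parameter block
    grad_batch : (X, y) mini-batch used only for the gradient
    curv_batch : (X, y) mini-batch used only for curvature-vector products
    """
    Xg, yg = grad_batch
    Xc, _ = curv_batch
    grad = Xg.T @ (Xg @ w - yg) / len(yg)

    def solve(idx):
        Xb = Xc[:, idx]      # columns of this block only: cross-block curvature is zero

        def hvp(v):
            # Curvature-vector product without forming the matrix (damping is an assumption).
            return Xb.T @ (Xb @ v) / Xc.shape[0] + damping * v

        return conjugate_gradient(hvp, grad[idx])

    # The blocks do not depend on each other, so they can be solved concurrently.
    with ThreadPoolExecutor() as pool:
        updates = list(pool.map(solve, blocks))

    delta = np.zeros_like(w)
    for idx, d in zip(blocks, updates):
        delta[idx] = d
    return delta
```

A caller would split the parameter indices into blocks, for example `[np.arange(0, 3), np.arange(3, 6)]`, draw two separate mini-batches, and apply `w += block_step(...)`. For a real network the curvature-vector product would come from automatic differentiation instead of an explicit data matrix.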
-