Publication No.: US20220391706A1
Publication Date: 2022-12-08
Application No.: US17831338
Filing Date: 2022-06-02
Applicant: Google LLC
Inventor: Luke Shekerjian Metz, Christian Daniel Freeman, Jascha Narain Sohl-Dickstein, Niruban Maheswaranathan, James Michael Harrison
IPC: G06N3/08
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training neural networks using learned optimizers. One of the methods is for training a neural network layer comprising a plurality of network parameters having a plurality of dimensions each having a plurality of indices, the method comprising: maintaining a set of values corresponding to respective sets of indices of each dimension, each value representing a measure of central tendency of past gradients of the network parameters having an index in the dimension that is in the set of indices; performing a training step to obtain a new gradient for each network parameter; updating each set of values using the new gradients; and for each network parameter: generating an input from the updated sets of values; processing the input using an optimizer neural network to generate an output defining an update for the network parameter; and applying the update.
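
The steps in the abstract can be illustrated with a minimal Python sketch. It is not the claimed implementation and rests on several assumptions that the abstract does not state: the layer is a 2-D weight matrix, each "set of indices" is a single row or column index, the measure of central tendency is an exponential moving average of the mean gradient, and the optimizer neural network is a tiny fixed-weight stand-in rather than a trained model. The names `factored_step`, `tiny_opt_net`, and `decay` are hypothetical.

```python
import math


def tiny_opt_net(features, w_hidden=((1.0, 0.0), (0.0, 1.0)), w_out=(-0.05, -0.05)):
    """Hypothetical stand-in for the optimizer neural network.

    One tanh hidden layer with fixed, untrained weights (an assumption);
    the returned scalar is used directly as the parameter update.
    """
    hidden = [math.tanh(sum(w * x for w, x in zip(row, features))) for row in w_hidden]
    return sum(w * h for w, h in zip(w_out, hidden))


def factored_step(params, grads, row_avg, col_avg, opt_net, decay=0.9):
    """One step for a 2-D layer: update per-index averages, then each parameter."""
    n_rows, n_cols = len(params), len(params[0])

    # Update the value kept for each row index from the new gradients
    # of all parameters that share that row index.
    for i in range(n_rows):
        row_mean = sum(grads[i]) / n_cols
        row_avg[i] = decay * row_avg[i] + (1 - decay) * row_mean

    # Same for each column index.
    for j in range(n_cols):
        col_mean = sum(grads[i][j] for i in range(n_rows)) / n_rows
        col_avg[j] = decay * col_avg[j] + (1 - decay) * col_mean

    # For each parameter: build the input from the updated values,
    # run the optimizer network, and apply the resulting update.
    for i in range(n_rows):
        for j in range(n_cols):
            features = [row_avg[i], col_avg[j]]
            params[i][j] += opt_net(features)

    return params, row_avg, col_avg


# Usage: a 2x2 layer with made-up gradients (illustrative values only).
params = [[0.5, -0.5], [1.0, 0.0]]
grads = [[1.0, 1.0], [-1.0, -1.0]]
row_avg = [0.0, 0.0]
col_avg = [0.0, 0.0]
factored_step(params, grads, row_avg, col_avg, tiny_opt_net)
```

Under these assumptions the stored state grows with the number of rows plus the number of columns rather than with the number of parameters, which is one plausible reading of why the values are kept per index of each dimension.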