Training recurrent neural networks to generate sequences

    Publication No.: US11954594B1

    Publication Date: 2024-04-09

    Application No.: US17315695

    Filing Date: 2021-05-10

    Applicant: Google LLC

    CPC classification number: G06N3/08

    Abstract: This document generally describes a neural network training system, including one or more computers, that trains a recurrent neural network (RNN) to receive an input, e.g., an input sequence, and to generate a sequence of outputs from the input sequence. In some implementations, training can include, for each position after an initial position in a training target sequence, selecting a preceding output of the RNN to provide as input to the RNN at the position, including determining whether to select as the preceding output (i) a true output in a preceding position in the output order or (ii) a value derived from an output of the RNN for the preceding position in an output order generated in accordance with current values of the parameters of the recurrent neural network.
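The selection described above, feeding the network either the true preceding output or a value derived from its own preceding prediction, can be sketched as a simple decode loop. This is a minimal illustration under stated assumptions, not the patented system: `model_step` is a hypothetical stand-in for one step of the RNN, and the coin flip stands in for the selection policy.

```python
import random

def scheduled_decode(model_step, targets, sample_prob, start_token, rng=random):
    """Generate outputs position by position, choosing at each position
    whether the next input is the model's own output or the true output.

    model_step: hypothetical stand-in for one RNN step (prev input -> output).
    sample_prob: probability of feeding the model-derived preceding output;
                 0.0 reduces to pure teacher forcing.
    """
    outputs, inp = [], start_token
    for t in range(len(targets)):
        out = model_step(inp)
        outputs.append(out)
        if rng.random() < sample_prob:
            inp = out              # (ii) value derived from the RNN's output
        else:
            inp = targets[t]       # (i) true output at the preceding position
    return outputs
```

In practice the sampling probability is typically annealed upward over training, so the network gradually learns to condition on its own predictions rather than on the ground truth.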

    Processing and generating sets using recurrent neural networks

    Publication No.: US20220180151A1

    Publication Date: 2022-06-09

    Application No.: US17679625

    Filing Date: 2022-02-24

    Applicant: Google LLC

    Abstract: In one aspect, this specification describes a recurrent neural network system implemented by one or more computers that is configured to process input sets to generate neural network outputs for each input set. The input set can be a collection of multiple inputs for which the recurrent neural network should generate the same neural network output regardless of the order in which the inputs are arranged in the collection. The recurrent neural network system can include a read neural network, a process neural network, and a write neural network. In another aspect, this specification describes a system implemented as computer programs on one or more computers in one or more locations that is configured to train a recurrent neural network that receives a neural network input and sequentially emits outputs to generate an output sequence for the neural network input.
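The order-independence of the read, process, and write structure can be illustrated with a toy sketch: each element is embedded independently (read), a query vector repeatedly attends over the resulting memory (process), and a linear readout emits the result (write). The linear embedding and readout here are illustrative placeholders, not the patented networks; the key point is that attention pooling sums over memory rows, which makes the summary invariant to the order of the inputs.

```python
import numpy as np

def read(xs, W):
    # Read block: embed each input element independently (toy linear embedding).
    return np.asarray(xs, dtype=float)[:, None] * W          # (n, d) memory

def process(memory, q, steps=3):
    # Process block: repeated content-based attention over the memory.
    # The softmax-weighted sum ranges over all elements, so permuting the
    # memory rows yields the same updated query vector.
    for _ in range(steps):
        scores = memory @ q
        weights = np.exp(scores - scores.max())
        weights /= weights.sum()
        q = q + weights @ memory
    return q

def write(q, V):
    # Write block: toy linear readout of the permutation-invariant summary.
    return V @ q
```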

    Training distilled machine learning models

    Publication No.: US10650328B2

    Publication Date: 2020-05-12

    Application No.: US16368526

    Filing Date: 2019-03-28

    Applicant: Google LLC

    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training a distilled machine learning model. One of the methods includes training a cumbersome machine learning model, wherein the cumbersome machine learning model is configured to receive an input and generate a respective score for each of a plurality of classes; and training a distilled machine learning model on a plurality of training inputs, wherein the distilled machine learning model is also configured to receive inputs and generate scores for the plurality of classes, comprising: processing each training input using the cumbersome machine learning model to generate a cumbersome target soft output for the training input; and training the distilled machine learning model to, for each of the training inputs, generate a soft output that matches the cumbersome target soft output for the training input.
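The core of the training step above, training the distilled model's soft output to match the cumbersome model's soft target, is commonly realized with a temperature-scaled softmax. The sketch below is a generic illustration of that loss, not the patented method; the temperature value is an arbitrary assumption.

```python
import numpy as np

def softmax(logits, T=1.0):
    # Temperature-scaled softmax; larger T produces softer distributions.
    z = np.asarray(logits, dtype=float) / T
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, T=4.0):
    # Cross-entropy between the cumbersome (teacher) model's soft targets
    # and the distilled (student) model's softened predictions; minimized
    # when the two softened distributions match.
    soft_targets = softmax(teacher_logits, T)
    log_student = np.log(softmax(student_logits, T))
    return float(-(soft_targets * log_student).sum(axis=-1).mean())
```

The soft targets carry more information per example than hard labels, since near-zero scores on wrong classes still encode the cumbersome model's learned similarity structure.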

    Training distilled machine learning models

    Publication No.: US10289962B2

    Publication Date: 2019-05-14

    Application No.: US14731349

    Filing Date: 2015-06-04

    Applicant: Google LLC

    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training a distilled machine learning model. One of the methods includes training a cumbersome machine learning model, wherein the cumbersome machine learning model is configured to receive an input and generate a respective score for each of a plurality of classes; and training a distilled machine learning model on a plurality of training inputs, wherein the distilled machine learning model is also configured to receive inputs and generate scores for the plurality of classes, comprising: processing each training input using the cumbersome machine learning model to generate a cumbersome target soft output for the training input; and training the distilled machine learning model to, for each of the training inputs, generate a soft output that matches the cumbersome target soft output for the training input.

    Sentence compression using recurrent neural networks

    Publication No.: US10229111B1

    Publication Date: 2019-03-12

    Application No.: US15423852

    Filing Date: 2017-02-03

    Applicant: Google LLC

    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for generating a sentence summary. In one aspect, the method includes actions of tokenizing the sentence into a plurality of tokens, processing data representative of each token in a first order using an LSTM neural network to initialize an internal state of a second LSTM neural network, and processing data representative of each token in a second order using the second LSTM neural network, comprising, for each token in the sentence: processing the data representative of the token using the second LSTM neural network in accordance with a current internal state of the second LSTM neural network to (i) generate an LSTM output for the token and (ii) update the current internal state of the second LSTM neural network, and generating a summarized version of the sentence using the outputs of the second LSTM neural network for the tokens.
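The two-pass structure above, one recurrent pass whose final state initializes a second pass that emits an output per token, can be sketched as follows. This is a toy illustration, not the patented system: a plain tanh recurrent cell stands in for the LSTMs, the "second order" is assumed here to be the reverse of the first, and the threshold readout for keep/drop decisions is an assumption.

```python
import numpy as np

def rnn_pass(xs, h0, Wx, Wh):
    # Simple tanh recurrent cell standing in for an LSTM (sketch only);
    # returns the hidden state after every step.
    h, states = h0, []
    for x in xs:
        h = np.tanh(Wx * x + Wh @ h)
        states.append(h)
    return states

def compress(token_embs, Wx1, Wh1, Wx2, Wh2, w_out, threshold=0.0):
    d = Wh1.shape[0]
    # First pass over the tokens builds an initial internal state.
    init = rnn_pass(token_embs, np.zeros(d), Wx1, Wh1)[-1]
    # Second pass, seeded with that state, runs over the tokens in the
    # assumed second (reverse) order and yields one output per token.
    states = rnn_pass(token_embs[::-1], init, Wx2, Wh2)[::-1]
    # Keep the tokens whose score clears the threshold; the kept tokens
    # form the summarized version of the sentence.
    return [i for i, h in enumerate(states) if float(w_out @ h) > threshold]
```

Returning token indices rather than strings keeps the sketch agnostic to how tokens are embedded; a real system would map the kept indices back to surface tokens.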
