Skimming text using recurrent neural networks

    Publication number: US10679006B2

    Publication date: 2020-06-09

    Application number: US16508066

    Application date: 2019-07-10

    Applicant: Google LLC

    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for processing sequential data. In one aspect, a computer-implemented method includes receiving a request to generate a system output for an input data sequence, the input data sequence including a plurality of tokens. One or more tokens may be designated as tokens to be skipped. When a token has not been designated as a token to be skipped, the token is processed using a recurrent neural network to update a current internal state of the recurrent neural network. The system output is generated from the final internal state of the recurrent neural network.
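
    The skip-and-update loop this abstract describes (the same procedure appears in the related applications below) can be illustrated with a short sketch. This is a minimal illustration under stated assumptions, not the claimed implementation: the tanh cell, the embedding table, and the randomly drawn skip_mask are all names introduced here for demonstration.

```python
# Minimal sketch of skimming with an RNN: tokens designated as "skip"
# leave the internal state untouched; all other tokens update it, so the
# cost of the forward pass scales with the number of processed tokens.
import numpy as np

rng = np.random.default_rng(0)

HIDDEN, VOCAB, EMBED = 16, 100, 8
W_xh = rng.normal(scale=0.1, size=(EMBED, HIDDEN))
W_hh = rng.normal(scale=0.1, size=(HIDDEN, HIDDEN))
embedding = rng.normal(scale=0.1, size=(VOCAB, EMBED))

def rnn_step(h, x):
    """One tanh-RNN update of the current internal state."""
    return np.tanh(x @ W_xh + h @ W_hh)

def skim(token_ids, skip_mask):
    """Process a token sequence, skipping tokens flagged in skip_mask."""
    h = np.zeros(HIDDEN)
    for tok, skip in zip(token_ids, skip_mask):
        if skip:
            continue  # designated token: internal state is not updated
        h = rnn_step(h, embedding[tok])
    return h  # final internal state; an output head would read this

tokens = rng.integers(0, VOCAB, size=12)
mask = rng.random(12) < 0.5  # hypothetical skip designations
final_state = skim(tokens, mask)
print(final_state.shape)  # (16,)
```

    In practice the skip decisions would themselves be predicted (rather than drawn at random as here), and the system output would be computed from the returned final state.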

    PROCESSING SEQUENTIAL DATA USING RECURRENT NEURAL NETWORKS

    Publication number: US20190340236A1

    Publication date: 2019-11-07

    Application number: US16508066

    Application date: 2019-07-10

    Applicant: Google LLC

    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for processing sequential data. In one aspect, a computer-implemented method includes receiving a request to generate a system output for an input data sequence, the input data sequence including a plurality of tokens. One or more tokens may be designated as tokens to be skipped. When a token has not been designated as a token to be skipped, the token is processed using a recurrent neural network to update a current internal state of the recurrent neural network. The system output is generated from the final internal state of the recurrent neural network.

    GENERATING LABELED TRAINING DATA USING A PRE-TRAINED LANGUAGE MODEL NEURAL NETWORK

    Publication number: US20230196105A1

    Publication date: 2023-06-22

    Application number: US18082934

    Application date: 2022-12-16

    Applicant: Google LLC

    CPC classification number: G06N3/08

    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for generating labeled training data using a pre-trained language model neural network. In particular, the language model neural network can generate the text input in a new labeled training example from an input sequence that includes (i) one or more context inputs and (ii) a text label that identifies the ground truth category for the new labeled training example.
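
    The generation recipe in this abstract — prompt the model with (i) context inputs and (ii) the target label, then take its continuation as the text of a new example carrying that label — can be sketched as follows. The prompt template, the sentiment task, and the names make_prompt and language_model are illustrative assumptions, not the patent's specifics; language_model stands in for any pre-trained autoregressive LM.

```python
# Minimal sketch: build an input sequence from context examples plus a
# text label, and treat the language model's continuation as the text
# input of a new labeled training example for that ground-truth category.
from typing import Callable, List, Tuple

def make_prompt(context_examples: List[Tuple[str, str]], target_label: str) -> str:
    """Build the input sequence: context inputs followed by the target label."""
    lines = [f"Review: {text}\nSentiment: {label}\n" for text, label in context_examples]
    lines.append(f"Sentiment: {target_label}\nReview:")  # request text of this class
    return "\n".join(lines)

def generate_labeled_example(
    language_model: Callable[[str], str],
    context_examples: List[Tuple[str, str]],
    target_label: str,
) -> Tuple[str, str]:
    """Return (generated_text, target_label) as a new labeled example."""
    prompt = make_prompt(context_examples, target_label)
    generated_text = language_model(prompt).strip()
    return generated_text, target_label

# Toy stand-in for a real pre-trained LM, for demonstration only.
fake_lm = lambda prompt: " The plot dragged and the acting felt flat."
context = [("A joyful, beautifully shot film.", "positive")]
print(generate_labeled_example(fake_lm, context, "negative"))
```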

    SKIMMING DATA SEQUENCES USING RECURRENT NEURAL NETWORKS

    Publication number: US20200265191A1

    Publication date: 2020-08-20

    Application number: US16865747

    Application date: 2020-05-04

    Applicant: Google LLC

    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for processing sequential data. In one aspect, a computer-implemented method includes receiving a request to generate a system output for an input data sequence, the input data sequence including a plurality of tokens. One or more tokens may be designated as tokens to be skipped. When a token has not been designated as a token to be skipped, the token is processed using a recurrent neural network to update a current internal state of the recurrent neural network. The system output is generated from the final internal state of the recurrent neural network.

    Efficient Training Mixture Calibration for Training Machine-Learned Models

    Publication number: US20250131321A1

    Publication date: 2025-04-24

    Application number: US18489503

    Application date: 2023-10-18

    Applicant: Google LLC

    Abstract: Systems and methods are provided for efficiently calibrating a data mixture for training machine-learned models (e.g., machine-learned sequence processing models, such as transformer-based models). For example, machine-learned models can be trained over a broad dataset that can include multiple different categories of data. The mixture of data categories within the dataset can influence model performance. To improve the performance of machine-learned models, example implementations of the present disclosure can learn a distribution of data categories using a lightweight proxy model before initiating training of a large primary model. In this manner, for instance, example implementations can obtain an improved training data distribution with less computational expense and can leverage the learned training data distribution to better train a large primary model.
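
    One plausible shape for the calibration loop described here is sketched below: a lightweight proxy model is trained under candidate category weights, and the weights are adjusted from its per-category validation losses before the large primary model is trained. The exponentiated-gradient update and the simulated losses are illustrative assumptions, not necessarily the method claimed in the application.

```python
# Minimal sketch of mixture calibration with a cheap proxy model: adjust
# per-category data weights from the proxy's validation losses, then use
# the learned mixture to train the large primary model.
import numpy as np

rng = np.random.default_rng(0)
CATEGORIES = ["web", "code", "books", "dialogue"]

def train_proxy_and_eval(weights: np.ndarray) -> np.ndarray:
    """Stand-in for training a small proxy model on the mixture defined by
    `weights` and returning per-category validation losses. Losses are
    simulated here: under-sampled categories remain harder."""
    return 1.0 / (weights + 0.1) + rng.normal(scale=0.01, size=len(weights))

weights = np.full(len(CATEGORIES), 1.0 / len(CATEGORIES))  # start uniform
lr = 0.1
for step in range(50):
    losses = train_proxy_and_eval(weights)
    # Upweight categories where the proxy still does poorly, then renormalize.
    weights *= np.exp(lr * losses / losses.sum())
    weights /= weights.sum()

print(dict(zip(CATEGORIES, np.round(weights, 3))))
# The learned `weights` would define the data mixture for the primary run.
```

    The point of the proxy is cost: the mixture search runs many cheap training-and-evaluation cycles, and only the final learned distribution is paid for at the primary model's scale.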
