Contrastive Pre-Training for Language Tasks

    Publication (Announcement) No.: US20250131208A1

    Publication (Announcement) Date: 2025-04-24

    Application No.: US18990884

    Filing Date: 2024-12-20

    Applicant: Google LLC

    Abstract: Systems and methods are provided that train a machine-learned language encoding model through the use of a contrastive learning task. In particular, the present disclosure describes a contrastive learning task where the encoder learns to distinguish input tokens from plausible alternatives. In some implementations, on each training example the proposed method masks out some subset (e.g., 15%) of the original input tokens, replaces the masked tokens with samples from a “generator” (e.g., which may be a small masked language model), and then trains the encoder to predict whether each token comes from the original data or is a replacement produced by the generator.
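
    Below is a minimal sketch of the replaced-token-detection objective described in this abstract. The module names (TinyGenerator, TinyDiscriminator), the layer sizes, the 15% mask rate, and the single training step are illustrative assumptions for this listing, not the patent's implementation.

        # Sketch (PyTorch): mask a subset of tokens, replace them with samples from a
        # small generator, and train the encoder to spot the replacements.
        import torch
        import torch.nn as nn

        vocab_size, hidden, batch, seq_len, mask_id = 1000, 64, 8, 32, 0

        class TinyGenerator(nn.Module):
            # Small masked language model that proposes plausible replacement tokens.
            def __init__(self):
                super().__init__()
                self.embed = nn.Embedding(vocab_size, hidden)
                self.out = nn.Linear(hidden, vocab_size)
            def forward(self, ids):
                return self.out(self.embed(ids))               # (batch, seq_len, vocab_size)

        class TinyDiscriminator(nn.Module):
            # Encoder that predicts, per token, whether it is original or replaced.
            def __init__(self):
                super().__init__()
                self.embed = nn.Embedding(vocab_size, hidden)
                self.head = nn.Linear(hidden, 1)
            def forward(self, ids):
                return self.head(self.embed(ids)).squeeze(-1)  # (batch, seq_len)

        generator, encoder = TinyGenerator(), TinyDiscriminator()
        ids = torch.randint(1, vocab_size, (batch, seq_len))   # original input tokens

        # 1) Mask out a subset (here ~15%) of the original input tokens.
        mask = torch.rand(batch, seq_len) < 0.15
        masked_ids = ids.masked_fill(mask, mask_id)

        # 2) Replace the masked tokens with samples from the generator.
        with torch.no_grad():
            probs = generator(masked_ids).softmax(dim=-1)
            samples = torch.distributions.Categorical(probs).sample()
        corrupted = torch.where(mask, samples, ids)

        # 3) Train the encoder to predict which tokens are replacements.
        labels = (corrupted != ids).float()                    # 1 = replaced, 0 = original
        loss = nn.functional.binary_cross_entropy_with_logits(encoder(corrupted), labels)
        loss.backward()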

    Contrastive pre-training for language tasks

    Publication (Announcement) No.: US11914969B2

    Publication (Announcement) Date: 2024-02-27

    Application No.: US17947843

    Filing Date: 2022-09-19

    Applicant: Google LLC

    CPC classification number: G06F40/40 G06N5/04 G06N20/00

    Abstract: Systems and methods are provided that train a machine-learned language encoding model through the use of a contrastive learning task. In particular, the present disclosure describes a contrastive learning task where the encoder learns to distinguish input tokens from plausible alternatives. In some implementations, on each training example the proposed method masks out some subset (e.g., 15%) of the original input tokens, replaces the masked tokens with samples from a “generator” (e.g., which may be a small masked language model), and then trains the encoder to predict whether each token comes from the original data or is a replacement produced by the generator.

    Contrastive pre-training for language tasks

    Publication (Announcement) No.: US12210845B2

    Publication (Announcement) Date: 2025-01-28

    Application No.: US18422856

    Filing Date: 2024-01-25

    Applicant: Google LLC

    Abstract: Systems and methods are provided that train a machine-learned language encoding model through the use of a contrastive learning task. In particular, the present disclosure describes a contrastive learning task where the encoder learns to distinguish input tokens from plausible alternatives. In some implementations, on each training example the proposed method masks out some subset (e.g., 15%) of the original input tokens, replaces the masked tokens with samples from a “generator” (e.g., which may be a small masked language model), and then trains the encoder to predict whether each token comes from the original data or is a replacement produced by the generator.

    Training machine learning models using unsupervised data augmentation

    Publication (Announcement) No.: US12118064B2

    Publication (Announcement) Date: 2024-10-15

    Application No.: US17606190

    Filing Date: 2020-04-24

    Applicant: Google LLC

    CPC classification number: G06F18/217 G06F18/2148 G06N3/08

    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training a machine learning model. One of the methods includes receiving training data comprising a plurality of unlabeled training inputs and a plurality of labeled training inputs; generating augmented training data, comprising generating, for each of the plurality of unlabeled training inputs, a respective augmented training input by applying a data augmentation technique to the unlabeled training input; and training the machine learning model on the augmented training data. In particular, but not exclusively, the model may be trained for perceptual tasks (e.g., tasks relating to vision or speech).
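
    A minimal sketch of this consistency-training idea follows. The toy classifier, the additive-noise augment function, and the equal loss weighting are assumptions standing in for task-specific augmentations (e.g., back-translation for text).

        # Sketch (PyTorch): supervised loss on labeled data plus a consistency loss
        # between predictions on unlabeled inputs and their augmented versions.
        import torch
        import torch.nn as nn
        import torch.nn.functional as F

        # Toy classifier standing in for the machine learning model being trained.
        model = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 3))

        def augment(x):
            # Stand-in data augmentation technique (additive noise); real systems
            # might use back-translation for text or image transformations.
            return x + 0.1 * torch.randn_like(x)

        labeled_x, labels = torch.randn(8, 16), torch.randint(0, 3, (8,))
        unlabeled_x = torch.randn(32, 16)

        # Supervised loss on the labeled training inputs.
        sup_loss = F.cross_entropy(model(labeled_x), labels)

        # Unsupervised loss: predictions on each unlabeled input and on its augmented
        # version (the augmented training data) are pushed to agree.
        with torch.no_grad():
            target = F.softmax(model(unlabeled_x), dim=-1)
        aug_log_probs = F.log_softmax(model(augment(unlabeled_x)), dim=-1)
        unsup_loss = F.kl_div(aug_log_probs, target, reduction="batchmean")

        (sup_loss + unsup_loss).backward()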

    Computationally efficient expressive output layers for neural networks

    Publication (Announcement) No.: US11481609B2

    Publication (Announcement) Date: 2022-10-25

    Application No.: US15931408

    Filing Date: 2020-05-13

    Applicant: Google LLC

    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for incorporating a computationally efficient expressive output layer in a neural network. The output layer is configured to map a received hidden state to a probability distribution over a vocabulary of possible outputs by generating, from the hidden state, a respective context embedding for each of a plurality of gates; for each of the possible outputs in the vocabulary, computing a gated logit for the possible output by applying an output embedding for the possible output to a weighted sum of the context embeddings; and generating the probability distribution over the vocabulary of possible outputs by applying a softmax to the gated logits for the possible outputs in the vocabulary.
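
    A short sketch of the gated output layer described in this abstract is given below. The softmax gate weighting and the tensor sizes are plausible readings of the abstract, not necessarily the patented formulation.

        # Sketch (PyTorch): per-gate context embeddings, a weighted sum over gates,
        # gated logits via the output embeddings, then a softmax over the vocabulary.
        import torch
        import torch.nn as nn

        hidden, vocab, n_gates = 64, 1000, 4

        to_contexts = nn.Linear(hidden, n_gates * hidden)   # context embedding per gate
        gate_scores = nn.Linear(hidden, n_gates)            # mixing weights over the gates
        output_embed = nn.Embedding(vocab, hidden)           # output embedding per possible output

        h = torch.randn(2, hidden)                            # received hidden states

        # Respective context embedding for each of the gates.
        contexts = to_contexts(h).view(-1, n_gates, hidden)
        weights = gate_scores(h).softmax(dim=-1).unsqueeze(-1)

        # Weighted sum of the per-gate context embeddings.
        mixed = (weights * contexts).sum(dim=1)               # (batch, hidden)

        # Gated logit per possible output: its output embedding applied to the weighted sum.
        gated_logits = mixed @ output_embed.weight.t()        # (batch, vocab)

        # Probability distribution over the vocabulary of possible outputs.
        probs = gated_logits.softmax(dim=-1)

    Applying a single output-embedding matrix to the mixed context keeps the layer cheap to compute, while the multiple gates add expressiveness over a plain softmax layer.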

    Contrastive Pre-Training for Language Tasks

    Publication (Announcement) No.: US20210089724A1

    Publication (Announcement) Date: 2021-03-25

    Application No.: US17026780

    Filing Date: 2020-09-21

    Applicant: Google LLC

    Abstract: Systems and methods are provided that train a machine-learned language encoding model through the use of a contrastive learning task. In particular, the present disclosure describes a contrastive learning task where the encoder learns to distinguish input tokens from plausible alternatives. In some implementations, on each training example the proposed method masks out some subset (e.g., 15%) of the original input tokens, replaces the masked tokens with samples from a “generator” (e.g., which may be a small masked language model), and then trains the encoder to predict whether each token comes from the original data or is a replacement produced by the generator.

    Contrastive Pre-Training for Language Tasks
    Invention Publication

    Publication (Announcement) No.: US20240160857A1

    Publication (Announcement) Date: 2024-05-16

    Application No.: US18422856

    Filing Date: 2024-01-25

    Applicant: Google LLC

    CPC classification number: G06F40/40 G06N5/04 G06N20/00

    Abstract: Systems and methods are provided that train a machine-learned language encoding model through the use of a contrastive learning task. In particular, the present disclosure describes a contrastive learning task where the encoder learns to distinguish input tokens from plausible alternatives. In some implementations, on each training example the proposed method masks out some subset (e.g., 15%) of the original input tokens, replaces the masked tokens with samples from a “generator” (e.g., which may be a small masked language model), and then trains the encoder to predict whether each token comes from the original data or is a replacement produced by the generator.

    SELF-TRAINING TECHNIQUE FOR GENERATING NEURAL NETWORK MODELS

    Publication (Announcement) No.: US20220083840A1

    Publication (Announcement) Date: 2022-03-17

    Application No.: US17018555

    Filing Date: 2020-09-11

    Applicant: Google LLC

    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, used to implement a self-training technique for generating neural network (NN) models. A first model is generated in response to training a first NN using labeled data. A respective pseudo label is generated for each item of unlabeled data when items of unlabeled data are processed using the first model. A second NN is used to process each item of a combined dataset to train the second NN. The combined dataset includes items of labeled data and a corresponding item for each respective pseudo label. Attributes of items in the combined dataset are modified to inject noise into the combined dataset when the second NN is trained. A second model is generated after the second NN is trained by processing items in the combined dataset, including processing items that represent the noise injected into the combined dataset.
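
    The self-training flow in this abstract can be sketched roughly as below. The toy networks, the additive noise used as the injected noise, and the single optimization steps are illustrative assumptions, not the patented procedure.

        # Sketch (PyTorch): train a first model on labeled data, pseudo-label the
        # unlabeled data, then train a second model on the noised combined dataset.
        import torch
        import torch.nn as nn
        import torch.nn.functional as F

        def make_net():
            # Toy stand-in for the first / second neural networks.
            return nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 3))

        labeled_x, labels = torch.randn(20, 16), torch.randint(0, 3, (20,))
        unlabeled_x = torch.randn(100, 16)

        # 1) First model: train the first NN on the labeled data (one step shown).
        first_nn = make_net()
        opt = torch.optim.SGD(first_nn.parameters(), lr=0.1)
        F.cross_entropy(first_nn(labeled_x), labels).backward()
        opt.step()

        # 2) Generate a pseudo label for each item of unlabeled data with the first model.
        with torch.no_grad():
            pseudo_labels = first_nn(unlabeled_x).argmax(dim=-1)

        # 3) Combined dataset: labeled items plus the pseudo-labeled items.
        combined_x = torch.cat([labeled_x, unlabeled_x])
        combined_y = torch.cat([labels, pseudo_labels])

        # 4) Inject noise into the combined dataset and train the second NN on it.
        second_nn = make_net()
        noisy_x = combined_x + 0.1 * torch.randn_like(combined_x)
        opt2 = torch.optim.SGD(second_nn.parameters(), lr=0.1)
        F.cross_entropy(second_nn(noisy_x), combined_y).backward()
        opt2.step()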

    Energy-Based Language Models
    Invention Application

    Publication (Announcement) No.: US20220067304A1

    Publication (Announcement) Date: 2022-03-03

    Application No.: US17458678

    Filing Date: 2021-08-27

    Applicant: Google LLC

    Abstract: Systems and methods are provided for training and using energy-based language models such as cloze language models. In particular, one aspect of the present disclosure is directed to an energy-based cloze language model for representation learning over text. In some instances, the models provided herein can be referred to as the “Electric” model. Similar to the BERT model, example models proposed herein can be a conditional generative model of tokens given their contexts. However, example models proposed herein do not mask text or output a full distribution over tokens that could occur in a context. Instead, the example proposed models assign a scalar energy score to each input token. Another aspect of the present disclosure provides techniques to train the proposed models to assign low energies to data tokens and high energies to other ones using an algorithm based on noise-contrastive estimation.
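
    A minimal sketch of the binary noise-contrastive training signal described here follows. The tiny energy network and the uniform noise proposal are assumptions, and the correction term involving the noise distribution used in full noise-contrastive estimation is omitted for brevity.

        # Sketch (PyTorch): assign a scalar energy to each token; push energies low
        # for data tokens and high for tokens drawn from a noise proposal.
        import torch
        import torch.nn as nn
        import torch.nn.functional as F

        vocab, hidden = 1000, 64
        embed = nn.Embedding(vocab, hidden)
        energy_head = nn.Linear(hidden, 1)

        def energy(ids):
            # Scalar energy score for each input token.
            return energy_head(embed(ids)).squeeze(-1)       # (batch, seq_len)

        data_ids = torch.randint(0, vocab, (4, 16))          # tokens from the data
        noise_ids = torch.randint(0, vocab, (4, 16))         # tokens from a noise proposal

        # Low energy (high score) for data tokens, high energy for noise tokens.
        data_scores, noise_scores = -energy(data_ids), -energy(noise_ids)
        loss = (F.binary_cross_entropy_with_logits(data_scores, torch.ones_like(data_scores))
                + F.binary_cross_entropy_with_logits(noise_scores, torch.zeros_like(noise_scores)))
        loss.backward()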

    SEQUENCE PROCESSING USING ONLINE ATTENTION
    Invention Application

    Publication (Announcement) No.: US20190332919A1

    Publication (Announcement) Date: 2019-10-31

    Application No.: US16504924

    Filing Date: 2019-07-08

    Applicant: Google LLC

    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for generating a target sequence including a respective output at each of multiple output time steps from respective encoded representations of inputs in an input sequence. The method includes, for each output time step, starting from the position, in the input order, of the encoded representation that was selected as a preceding context vector at a preceding output time step, traversing the encoded representations until an encoded representation is selected as a current context vector at the output time step. A decoder neural network processes the current context vector and a preceding output at the preceding output time step to generate a respective output score for each possible output and to update the hidden state of the decoder recurrent neural network. An output is selected for the output time step using the output scores.
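
    The online decode loop in this abstract can be sketched as below. The threshold-based selection rule, the GRU decoder cell, and the zero placeholder for the preceding output's embedding are assumptions made for illustration, not the patent's exact mechanism.

        # Sketch (PyTorch): at each output step, traverse the encoded inputs from the
        # previously selected position until one is picked as the context vector.
        import torch
        import torch.nn as nn

        hidden, vocab, T_in, T_out = 32, 50, 10, 5
        encoded = torch.randn(T_in, hidden)                  # encoded input representations
        selector = nn.Linear(2 * hidden, 1)                  # scores whether to stop at a position
        decoder = nn.GRUCell(2 * hidden, hidden)             # decoder recurrent cell
        out_proj = nn.Linear(hidden, vocab)                  # output scores per possible output

        state = torch.zeros(1, hidden)
        prev_out = torch.zeros(1, hidden)                    # placeholder for preceding output embedding
        pos = 0                                              # position of the preceding context vector
        outputs = []

        for _ in range(T_out):
            # Traverse the encoded representations left-to-right, starting from the
            # position selected at the preceding output time step, until one is chosen.
            for j in range(pos, T_in):
                p_stop = torch.sigmoid(selector(torch.cat([encoded[j:j + 1], state], dim=-1)))
                if p_stop.item() > 0.5 or j == T_in - 1:
                    pos = j
                    break
            context = encoded[pos:pos + 1]                   # current context vector

            # Decoder processes the context vector and the preceding output, updating
            # its hidden state and producing a score for each possible output.
            state = decoder(torch.cat([context, prev_out], dim=-1), state)
            outputs.append(out_proj(state).argmax(dim=-1))   # select an output for this step

        print(torch.stack(outputs).squeeze(-1))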
