-
Publication No.: US10679006B2
Publication Date: 2020-06-09
Application No.: US16508066
Filing Date: 2019-07-10
Applicant: Google LLC
Inventor: Quoc V. Le , Hongrae Lee , Wei Yu
IPC: G06F40/284 , G06N3/08 , G06F40/289 , G06N3/04
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for processing sequential data. In one aspect, a computer-implemented method includes receiving a request to generate a system output for an input data sequence, the input data sequence including a plurality of tokens. One or more tokens may be designated as tokens to be skipped. When a token has not been designated as a token to be skipped, the token is processed using a recurrent neural network to update a current internal state of the recurrent neural network. The system output is generated from the final internal state of the recurrent neural network.
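The token-skipping recurrence described in the abstract above can be sketched as follows. This is a minimal illustration, not the patented implementation: the tanh RNN cell, state size, and the boolean skip mask are all assumptions made for the example.

```python
import numpy as np

def run_skip_rnn(tokens, skip, W_h, W_x, b):
    """Update the hidden state only for tokens not designated as skipped."""
    h = np.zeros(W_h.shape[0])
    for token, skipped in zip(tokens, skip):
        if skipped:
            continue  # a designated token leaves the internal state unchanged
        h = np.tanh(W_h @ h + W_x @ token + b)
    return h  # final internal state, from which the system output is generated

rng = np.random.default_rng(0)
d = 4
W_h, W_x, b = rng.normal(size=(d, d)), rng.normal(size=(d, d)), np.zeros(d)
tokens = [rng.normal(size=d) for _ in range(5)]
final_state = run_skip_rnn(tokens, [False, True, False, True, False], W_h, W_x, b)
```

Because skipped tokens never touch the state, running only the kept tokens yields the same final state, which is the efficiency point of skipping.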
-
Publication No.: US20190340236A1
Publication Date: 2019-11-07
Application No.: US16508066
Filing Date: 2019-07-10
Applicant: Google LLC
Inventor: Quoc V. Le , Hongrae Lee , Wei Yu
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for processing sequential data. In one aspect, a computer-implemented method includes receiving a request to generate a system output for an input data sequence, the input data sequence including a plurality of tokens. One or more tokens may be designated as tokens to be skipped. When a token has not been designated as a token to be skipped, the token is processed using a recurrent neural network to update a current internal state of the recurrent neural network. The system output is generated from the final internal state of the recurrent neural network.
-
Publication No.: US20250053751A1
Publication Date: 2025-02-13
Application No.: US18413495
Filing Date: 2024-01-16
Applicant: GOOGLE LLC
Inventor: Oscar Akerlund , Evgeny Sluzhaev , Golnaz Ghiasi , Thang Luong , Yifeng Lu , Igor Petrovski , Agoston Weisz , Wei Yu , Rakesh Shivanna , Michael Andrew Goodman , Apoorv Kulshreshtha , Yu Du , Amin Ghafouri , Sanil Jain , Dustin Tran , Vikas Peswani , YaGuang Li
IPC: G06F40/40
Abstract: Implementations relate to generating multi-modal response(s) through utilization of large language model(s) (LLM(s)). Processor(s) of a system can: receive natural language (NL) based input, generate a multi-modal response that is responsive to the NL based input, and cause the multi-modal response to be rendered. In some implementations, and in generating the multi-modal response, the processor(s) can process, using an LLM, LLM input (e.g., that includes at least the NL based input) to generate LLM output, and determine, based on the LLM output, textual content for inclusion in the multi-modal response and multimedia content for inclusion in the multi-modal response. In some implementations, the multimedia content can be obtained based on a multimedia content tag that is included in the LLM output and that is indicative of the multimedia content. In various implementations, the multimedia content can be interleaved between segments of the textual content.
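The tag-based interleaving this abstract describes can be sketched roughly as below. The `<media:...>` tag syntax and the `MEDIA_LOOKUP` table are illustrative assumptions; the patent does not specify a tag format.

```python
import re

# Hypothetical store mapping tag names to multimedia content.
MEDIA_LOOKUP = {"golden_gate": "https://example.com/golden_gate.jpg"}

def build_multimodal_response(llm_output):
    """Return an ordered list of ('text', ...) and ('media', ...) parts."""
    parts = []
    # Capturing group keeps the tags themselves in the split result.
    for piece in re.split(r"(<media:[^>]+>)", llm_output):
        m = re.fullmatch(r"<media:([^>]+)>", piece)
        if m:
            parts.append(("media", MEDIA_LOOKUP.get(m.group(1))))
        elif piece.strip():
            parts.append(("text", piece.strip()))
    return parts

response = build_multimodal_response(
    "The Golden Gate Bridge opened in 1937. <media:golden_gate> It spans 2.7 km."
)
```

The resulting list preserves order, so the multimedia content ends up interleaved between the surrounding textual segments when rendered.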
-
Publication No.: US11947923B1
Publication Date: 2024-04-02
Application No.: US18520218
Filing Date: 2023-11-27
Applicant: GOOGLE LLC
Inventor: Sanil Jain , Wei Yu , Ágoston Weisz , Michael Andrew Goodman , Diana Avram , Amin Ghafouri , Golnaz Ghiasi , Igor Petrovski , Khyatti Gupta , Oscar Akerlund , Evgeny Sluzhaev , Rakesh Shivanna , Thang Luong , Komal Singh , Yifeng Lu , Vikas Peswani
Abstract: Implementations relate to managing multimedia content that is obtained by large language model(s) (LLM(s)) and/or generated by other generative model(s). Processor(s) of a system can: receive natural language (NL) based input that requests multimedia content, generate a response that is responsive to the NL based input, and cause the response to be rendered. In some implementations, and in generating the response, the processor(s) can process, using an LLM, LLM input to generate LLM output, and determine, based on the LLM output, at least multimedia content to be included in the response. Further, the processor(s) can evaluate the multimedia content to determine whether it should be included in the response. In response to determining that the multimedia content should not be included in the response, the processor(s) can cause the response, including alternative multimedia content or other textual content, to be rendered.
-
Publication No.: US20230196105A1
Publication Date: 2023-06-22
Application No.: US18082934
Filing Date: 2022-12-16
Applicant: Google LLC
Inventor: Zirui Wang , Wei Yu , Orhan Firat , Yuan Cao
IPC: G06N3/08
CPC classification number: G06N3/08
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for generating labeled training data using a pre-trained language model neural network. In particular, the language model neural network can generate the text input in a new labeled training example from an input sequence that includes (i) one or more context inputs and (ii) a text label that identifies the ground truth category for the new labeled training example.
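The generation scheme in this abstract, conditioning a language model on context inputs plus a ground-truth label so the model writes the text of a new labeled example, can be sketched as follows. The prompt template and the stub `generate` function are illustrative assumptions standing in for a real pre-trained LM.

```python
def build_prompt(context_inputs, label):
    """Assemble a prompt from context inputs and the target label."""
    examples = "\n".join(f"Example: {c}" for c in context_inputs)
    return f"{examples}\nWrite a new {label} example:\n"

def make_labeled_example(generate, context_inputs, label):
    """Return (text, label): the model writes the text; the label is fixed up front."""
    text = generate(build_prompt(context_inputs, label))
    return text, label

# Stub standing in for a pre-trained language model neural network.
stub = lambda prompt: "The battery life on this phone is outstanding."
example = make_labeled_example(stub, ["Great camera!", "Fast shipping."], "positive")
```

Note the inversion relative to ordinary annotation: the label is chosen first, and the model synthesizes an input consistent with it.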
-
Publication No.: US20200265191A1
Publication Date: 2020-08-20
Application No.: US16865747
Filing Date: 2020-05-04
Applicant: Google LLC
Inventor: Quoc V. Le , Hongrae Lee , Wei Yu
IPC: G06F40/284 , G06N3/08 , G06N3/04
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for processing sequential data. In one aspect, a computer-implemented method includes receiving a request to generate a system output for an input data sequence, the input data sequence including a plurality of tokens. One or more tokens may be designated as tokens to be skipped. When a token has not been designated as a token to be skipped, the token is processed using a recurrent neural network to update a current internal state of the recurrent neural network. The system output is generated from the final internal state of the recurrent neural network.
-
Publication No.: US20250131321A1
Publication Date: 2025-04-24
Application No.: US18489503
Filing Date: 2023-10-18
Applicant: Google LLC
Inventor: Wei Yu , Sang Xie , Hieu Hy Pham , Quoc V. Le
IPC: G06N20/00
Abstract: Systems and methods are provided for efficiently calibrating a data mixture for training machine-learned models (e.g., machine-learned sequence processing models, such as transformer-based models). For example, machine-learned models can be trained over a broad dataset that can include multiple different categories of data. The mixture of data categories within the dataset can influence model performance. To improve the performance of machine-learned models, example implementations of the present disclosure can learn a distribution of data categories using a lightweight proxy model before initiating training of a large primary model. In this manner, for instance, example implementations can obtain an improved training data distribution with less computational expense and can leverage the learned training data distribution to better train a large primary model.
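One way to picture calibrating a data mixture with a lightweight proxy, as this abstract describes, is a multiplicative reweighting step: categories where the proxy shows high excess loss get upweighted before the large primary model trains. The update rule, learning rate, and loss figures below are illustrative assumptions, not the claimed method.

```python
import math

def reweight_mixture(weights, excess_loss, lr=0.5):
    """One multiplicative update of category weights toward high-loss categories."""
    scaled = {c: w * math.exp(lr * excess_loss[c]) for c, w in weights.items()}
    total = sum(scaled.values())
    return {c: v / total for c, v in scaled.items()}  # renormalize to a distribution

weights = {"web": 0.5, "code": 0.25, "books": 0.25}
excess = {"web": 0.1, "code": 0.9, "books": 0.3}  # proxy loss minus a reference loss
new_weights = reweight_mixture(weights, excess)
```

Because the proxy is small, iterating this loop to convergence is cheap relative to training the primary model on the resulting mixture once.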
-
Publication No.: US12277400B1
Publication Date: 2025-04-15
Application No.: US18590498
Filing Date: 2024-02-28
Applicant: GOOGLE LLC
Inventor: Sanil Jain , Wei Yu , Ágoston Weisz , Michael Andrew Goodman , Diana Avram , Amin Ghafouri , Golnaz Ghiasi , Igor Petrovski , Khyatti Gupta , Oscar Akerlund , Evgeny Sluzhaev , Rakesh Shivanna , Thang Luong , Komal Singh , Yifeng Lu , Vikas Peswani
Abstract: Implementations relate to managing multimedia content that is obtained by large language model(s) (LLM(s)) and/or generated by other generative model(s). Processor(s) of a system can: receive natural language (NL) based input that requests multimedia content, generate a response that is responsive to the NL based input, and cause the response to be rendered. In some implementations, and in generating the response, the processor(s) can process, using an LLM, LLM input to generate LLM output, and determine, based on the LLM output, at least multimedia content to be included in the response. Further, the processor(s) can evaluate the multimedia content to determine whether it should be included in the response. In response to determining that the multimedia content should not be included in the response, the processor(s) can cause the response, including alternative multimedia content or other textual content, to be rendered.
-
Publication No.: US11954442B2
Publication Date: 2024-04-09
Application No.: US16986534
Filing Date: 2020-08-06
Applicant: Google LLC
Inventor: Chen Liang , Wei Yu , Quoc V. Le , Xinyun Chen , Dengyong Zhou
IPC: G06F40/30 , G06F16/33 , G06F40/20 , G06N3/045 , G06N3/08 , G06N20/00 , G06F40/216 , G06F40/284
CPC classification number: G06F40/30 , G06F16/3347 , G06F40/20 , G06N3/045 , G06N3/08 , G06N20/00 , G06F40/216 , G06F40/284
Abstract: The present disclosure is directed to systems and methods for performing reading comprehension with machine learning. More specifically, the present disclosure is directed to a Neural Symbolic Reader (example implementations of which may be referred to as NeRd), which includes a reader to encode the passage and question, and a programmer to generate a program for multi-step reasoning. By using operators like span selection, the program can be executed over a natural language text passage to generate an answer to a natural language text question. NeRd is domain-agnostic such that the same neural architecture works for different domains. Further, NeRd is compositional such that complex programs can be generated by compositionally applying the symbolic operators.
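The compositional program execution this abstract describes, symbolic operators such as span selection applied over a passage, can be sketched in miniature as below. The operator set (`SPAN`, `VALUE`, `DIFF`) and the nested-tuple program format are illustrative assumptions in the spirit of the described Neural Symbolic Reader, not its actual operator library.

```python
def execute(program, passage):
    """Recursively evaluate a nested (op, *args) program against a passage."""
    op, *args = program
    if op == "SPAN":   # select the substring between word positions [start, end)
        start, end = args
        return " ".join(passage.split()[start:end])
    if op == "VALUE":  # parse a selected span as a number
        return float(execute(args[0], passage))
    if op == "DIFF":   # arithmetic composed over sub-programs
        return execute(args[0], passage) - execute(args[1], passage)
    raise ValueError(f"unknown operator: {op}")

passage = "The home team scored 31 points and the visitors scored 17 points"
program = ("DIFF", ("VALUE", ("SPAN", 4, 5)), ("VALUE", ("SPAN", 10, 11)))
margin = execute(program, passage)  # 31 - 17
```

The executor itself is domain-agnostic: multi-step reasoning lives entirely in the program, which a learned "programmer" would generate from the encoded passage and question.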
-
Publication No.: US11907674B1
Publication Date: 2024-02-20
Application No.: US18370683
Filing Date: 2023-09-20
Applicant: GOOGLE LLC
Inventor: Oscar Akerlund , Evgeny Sluzhaev , Golnaz Ghiasi , Thang Luong , Yifeng Lu , Igor Petrovski , Ágoston Weisz , Wei Yu , Rakesh Shivanna , Michael Andrew Goodman , Apoorv Kulshreshtha , Yu Du , Amin Ghafouri , Sanil Jain , Dustin Tran , Vikas Peswani , YaGuang Li
CPC classification number: G06F40/40
Abstract: Implementations relate to generating multi-modal response(s) through utilization of large language model(s) (LLM(s)). Processor(s) of a system can: receive natural language (NL) based input, generate a multi-modal response that is responsive to the NL based input, and cause the multi-modal response to be rendered. In some implementations, and in generating the multi-modal response, the processor(s) can process, using an LLM, LLM input (e.g., that includes at least the NL based input) to generate LLM output, and determine, based on the LLM output, textual content for inclusion in the multi-modal response and multimedia content for inclusion in the multi-modal response. In some implementations, the multimedia content can be obtained based on a multimedia content tag that is included in the LLM output and that is indicative of the multimedia content. In various implementations, the multimedia content can be interleaved between segments of the textual content.