-
Publication number: US20220245428A1
Publication date: 2022-08-04
Application number: US17592796
Filing date: 2022-02-04
Applicant: Google LLC
Inventor: Yi Tay, Da-Cheng Juan, Dara Bahri, Donald Arthur Metzler, Jr., Jai Prakash Gupta, Mostafa Dehghani, Phillip Pham, Vamsi Krishna Aribandi, Zhen Qin
Abstract: Provided are machine-learned attention models that feature omnidirectional processing, example implementations of which can be referred to as Omnidirectional Representations from Transformers (OMNINET). In example models described in the present disclosure, instead of maintaining a strictly horizontal receptive field, each token is allowed to attend to all tokens in some or all of the other layers across the entire network.
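Illustrative sketch (not taken from the patent text): the idea of a token attending beyond its own layer can be shown by flattening the hidden states of several layers into one memory and running ordinary scaled dot-product attention over it. The function name `omni_attend`, the list-of-lists data layout, and the use of plain dot-product attention are assumptions made for this example only; the disclosure may use a different attention mechanism.

```python
import math


def omni_attend(query, layer_states):
    """Attend from one query vector to every token of every layer.

    query:        list of d floats (hypothetical query for one token)
    layer_states: list of layers, each a list of d-dimensional token vectors
    Returns (output_vector, attention_weights).
    """
    # Flatten all layers into a single memory of token vectors.
    memory = [vec for layer in layer_states for vec in layer]
    d = len(query)

    # Scaled dot-product scores against every token in every layer.
    scores = [sum(q * k for q, k in zip(query, vec)) / math.sqrt(d)
              for vec in memory]

    # Numerically stable softmax over the flattened memory.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]

    # Weighted sum of the memory vectors.
    output = [sum(w * vec[j] for w, vec in zip(weights, memory))
              for j in range(d)]
    return output, weights
```

With two layers of two tokens each, the weights returned have length four, so the query draws on tokens from both layers rather than from a single layer.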
-
Publication number: US11886976B1
Publication date: 2024-01-30
Application number: US18222395
Filing date: 2023-07-14
Applicant: Google LLC
Inventor: Tal Schuster, Adam Joshua Fisch, Jai Prakash Gupta, Mostafa Dehghani, Dara Bahri, Vinh Quoc Tran, Yi Tay, Donald Arthur Metzler, Jr.
IPC: G06N3/0455
CPC classification number: G06N3/0455
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for generating output sequences using auto-regressive decoder neural networks. In particular, during generation, adaptive early exiting is used to reduce the time required to generate the output sequence.
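Illustrative sketch (not taken from the patent text): adaptive early exiting during decoding can be pictured as running the decoder layers one at a time for each token and stopping as soon as an intermediate prediction looks confident enough. The confidence measure used here (top softmax probability), the `threshold` parameter, and the names `decode_token_with_early_exit`, `layers`, and `to_logits` are all assumptions for this example; the abstract does not specify them.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    top = max(logits)
    exps = [math.exp(x - top) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def decode_token_with_early_exit(hidden, layers, to_logits, threshold=0.9):
    """Pick the next token, skipping remaining layers once confident.

    hidden:    initial hidden state for the current position
    layers:    non-empty list of callables, each mapping hidden -> hidden
    to_logits: callable mapping hidden -> list of vocabulary logits
    threshold: assumed confidence level at which decoding exits early
    Returns (token_id, layers_used).
    """
    if not layers:
        raise ValueError("at least one decoder layer is required")

    probs = None
    layers_used = 0
    for layer in layers:
        hidden = layer(hidden)
        layers_used += 1
        probs = softmax(to_logits(hidden))
        # Exit as soon as the most likely token is confident enough.
        if max(probs) >= threshold:
            break

    token_id = max(range(len(probs)), key=probs.__getitem__)
    return token_id, layers_used
```

In this sketch a lower `threshold` trades prediction confidence for fewer layer evaluations per token, which is where the reduction in generation time would come from.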
-
Publication number: US20240289552A1
Publication date: 2024-08-29
Application number: US18564859
Filing date: 2022-05-27
Applicant: Google LLC
Inventor: Yi Tay, Dara Bahri, Donald Arthur Metzler, Jr., Hyung Won Chung, Jai Prakash Gupta, Sebastian Nikolas Ruder, Simon Baumgartner, Vinh Quoc Tran, Zhen Qin
IPC: G06F40/284
CPC classification number: G06F40/284
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for performing a machine learning task on an input sequence of characters that has a respective character at each of a plurality of character positions to generate a network output. One of the systems includes a neural network configured to perform the machine learning task, the neural network comprising a gradient-based sub-word tokenizer and an output neural network. The gradient-based sub-word tokenizer is configured to apply a learned, i.e., flexible, sub-word tokenization strategy to the input sequence of characters to generate a sequence of latent sub-word representations. The output neural network is configured to process the latent sub-word representation to generate the network output for the task.
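Illustrative sketch (not taken from the patent text): one way to make sub-word tokenization learnable is to build, for every character position, candidate blocks of several widths, score each candidate, and blend them with softmax weights, so the choice of block width stays differentiable. The function name `soft_subword_mix`, the mean-pooling of blocks, and the `score` callable (a stand-in for a scorer that would be trained by gradient descent) are assumptions for this example, not the claimed method.

```python
import math


def soft_subword_mix(char_vecs, block_sizes, score):
    """Blend mean-pooled character blocks of several widths per position.

    char_vecs:   list of equal-length float vectors, one per character
    block_sizes: candidate block widths, e.g. [1, 2, 4]
    score:       callable mapping a block vector to a float
                 (hypothetical stand-in for a learned scoring function)
    Returns one blended vector per character position.
    """
    dim = len(char_vecs[0])
    mixed = []
    for i in range(len(char_vecs)):
        # Build one candidate per block width: the mean of the block
        # of that width that contains position i.
        candidates = []
        for width in block_sizes:
            start = (i // width) * width
            block = char_vecs[start:start + width]
            candidates.append(
                [sum(v[j] for v in block) / len(block) for j in range(dim)]
            )

        # Softmax over candidate scores gives soft block-width weights.
        scores = [score(c) for c in candidates]
        top = max(scores)
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)

        mixed.append(
            [sum(e / total * c[j] for e, c in zip(exps, candidates))
             for j in range(dim)]
        )
    return mixed
```

The blended vectors play the role of latent sub-word representations in this sketch; a downstream network would consume them in place of a fixed tokenizer's output.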
-
Publication number: US20240169184A1
Publication date: 2024-05-23
Application number: US18426212
Filing date: 2024-01-29
Applicant: Google LLC
Inventor: Tal Schuster, Adam Joshua Fisch, Jai Prakash Gupta, Mostafa Dehghani, Dara Bahri, Vinh Quoc Tran, Yi Tay, Donald Arthur Metzler, Jr.
IPC: G06N3/0455
CPC classification number: G06N3/0455
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for generating output sequences using auto-regressive decoder neural networks. In particular, during generation, adaptive early exiting is used to reduce the time required to generate the output sequence.
-
Publication number: US20240020516A1
Publication date: 2024-01-18
Application number: US18222395
Filing date: 2023-07-14
Applicant: Google LLC
Inventor: Tal Schuster, Adam Joshua Fisch, Jai Prakash Gupta, Mostafa Dehghani, Dara Bahri, Vinh Quoc Tran, Yi Tay, Donald Arthur Metzler, Jr.
IPC: G06N3/0455
CPC classification number: G06N3/0455
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for generating output sequences using auto-regressive decoder neural networks. In particular, during generation, adaptive early exiting is used to reduce the time required to generate the output sequence.