-
Publication number: US20240020516A1
Publication date: 2024-01-18
Application number: US18222395
Filing date: 2023-07-14
Applicant: Google LLC
Inventor: Tal Schuster , Adam Joshua Fisch , Jai Prakash Gupta , Mostafa Dehghani , Dara Bahri , Vinh Quoc Tran , Yi Tay , Donald Arthur Metzler, Jr.
IPC: G06N3/0455
CPC classification number: G06N3/0455
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for generating output sequences using auto-regressive decoder neural networks. In particular, during generation, adaptive early exiting is used to reduce the time required to generate the output sequence.
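The abstract describes adaptive early exiting only at a high level. Below is a minimal sketch of what one decoding step with intermediate exits might look like, assuming the decoder's layers can be run one at a time and share a single output head; all names (`EarlyExitDecoder`, `lm_head`, `decode_step`, `threshold`) and the confidence rule are illustrative assumptions, not taken from the patent.

```python
# Hypothetical sketch of adaptive early exiting during auto-regressive decoding.
# Layer structure and names are stand-ins, not taken from the patent.
import torch
import torch.nn as nn

class EarlyExitDecoder(nn.Module):
    def __init__(self, num_layers=8, d_model=64, vocab_size=100):
        super().__init__()
        # Simple MLP blocks stand in for transformer decoder layers.
        self.layers = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_model), nn.GELU())
            for _ in range(num_layers)
        )
        # One output head shared by every intermediate exit point.
        self.lm_head = nn.Linear(d_model, vocab_size)

    def decode_step(self, hidden, threshold=0.9):
        """Run layers one at a time; stop once the top-token probability
        at an intermediate exit crosses the confidence threshold."""
        for depth, layer in enumerate(self.layers, start=1):
            hidden = layer(hidden)
            probs = torch.softmax(self.lm_head(hidden), dim=-1)
            confidence, token = probs.max(dim=-1)
            if confidence.item() >= threshold:
                return token, depth           # early exit: skip remaining layers
        return token, len(self.layers)         # fall back to the full depth

decoder = EarlyExitDecoder()
state = torch.randn(1, 64)                     # toy hidden state for one position
token, layers_used = decoder.decode_step(state)
print(f"emitted token {token.item()} after {layers_used} layers")
```

Because easy tokens can exit after a few layers while hard tokens still get the full depth, the average per-token cost drops, which is the time saving the abstract refers to.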
-
Publication number: US11886976B1
Publication date: 2024-01-30
Application number: US18222395
Filing date: 2023-07-14
Applicant: Google LLC
Inventor: Tal Schuster , Adam Joshua Fisch , Jai Prakash Gupta , Mostafa Dehghani , Dara Bahri , Vinh Quoc Tran , Yi Tay , Donald Arthur Metzler, Jr.
IPC: G06N3/0455
CPC classification number: G06N3/0455
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for generating output sequences using auto-regressive decoder neural networks. In particular, during generation, adaptive early exiting is used to reduce the time required to generate the output sequence.
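This granted patent shares the application and abstract of the publication above. Building on the toy decoder sketched there, one way an exit threshold could be chosen is by calibrating it so that early-exit tokens still agree with full-depth tokens on held-out inputs; this calibration procedure is an assumption for illustration and is not stated in the abstract.

```python
# Hypothetical threshold calibration for the early-exit decoder sketched above.
# Pick the lowest threshold whose early-exit tokens still agree with the
# full-depth tokens on at least `target_agreement` of a calibration set.
# The abstract does not describe this procedure; it is illustrative only.
import torch

def calibrate_threshold(decoder, states, target_agreement=0.95,
                        candidates=(0.5, 0.7, 0.8, 0.9, 0.95, 0.99)):
    for thr in sorted(candidates):
        agreements = 0
        for state in states:
            early_tok, _ = decoder.decode_step(state, threshold=thr)
            full_tok, _ = decoder.decode_step(state, threshold=1.1)  # 1.1 is never reached, so all layers run
            agreements += int(early_tok.item() == full_tok.item())
        if agreements / len(states) >= target_agreement:
            return thr               # cheapest threshold that meets the target
    return 1.1                       # nothing met the target: always use full depth

# Example with the toy decoder from the previous sketch:
# states = [torch.randn(1, 64) for _ in range(32)]
# threshold = calibrate_threshold(EarlyExitDecoder(), states)
```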
-
Publication number: US20240289552A1
Publication date: 2024-08-29
Application number: US18564859
Filing date: 2022-05-27
Applicant: Google LLC
Inventor: Yi Tay , Dara Bahri , Donald Arthur Metzler, Jr. , Hyung Won Chung , Jai Prakash Gupta , Sebastian Nikolas Ruder , Simon Baumgartner , Vinh Quoc Tran , Zhen Qin
IPC: G06F40/284
CPC classification number: G06F40/284
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for performing a machine learning task on an input sequence of characters that has a respective character at each of a plurality of character positions to generate a network output. One of the systems includes a neural network configured to perform the machine learning task, the neural network comprising a gradient-based sub-word tokenizer and an output neural network. The gradient-based sub-word tokenizer is configured to apply a learned, i.e., flexible, sub-word tokenization strategy to the input sequence of characters to generate a sequence of latent sub-word representations. The output neural network is configured to process the latent sub-word representations to generate the network output for the task.
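A minimal sketch of a gradient-based ("soft") sub-word tokenizer in the spirit of this abstract: character embeddings are pooled into candidate blocks of several sizes, a learned scorer softly selects a block size per position, and the mixture is downsampled into a shorter sequence of latent sub-word representations. The block sizes, pooling choices, and all names (`SoftSubwordTokenizer`, `scorer`, `downsample`) are illustrative assumptions, not taken from the patent.

```python
# Hypothetical sketch of a learned, differentiable sub-word tokenizer.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SoftSubwordTokenizer(nn.Module):
    def __init__(self, vocab_size=256, d_model=64, block_sizes=(1, 2, 3, 4), downsample=2):
        super().__init__()
        self.char_embed = nn.Embedding(vocab_size, d_model)
        self.block_sizes = block_sizes
        self.scorer = nn.Linear(d_model, 1)    # scores each candidate block embedding
        self.downsample = downsample

    def forward(self, char_ids):                # char_ids: (batch, seq_len)
        x = self.char_embed(char_ids)            # (batch, seq_len, d_model)
        candidates = []
        for b in self.block_sizes:
            # Mean-pool characters within each block of size b, then broadcast the
            # block embedding back to every character position it covers.
            pooled = F.avg_pool1d(x.transpose(1, 2), kernel_size=b, stride=b, ceil_mode=True)
            upsampled = pooled.repeat_interleave(b, dim=2)[:, :, : x.size(1)]
            candidates.append(upsampled.transpose(1, 2))          # (batch, seq_len, d_model)
        stacked = torch.stack(candidates, dim=2)                   # (batch, seq_len, num_sizes, d_model)
        weights = F.softmax(self.scorer(stacked).squeeze(-1), dim=-1)
        mixed = (weights.unsqueeze(-1) * stacked).sum(dim=2)       # soft block-size choice per position
        # Downsample to a shorter sequence of latent sub-word representations.
        latents = F.avg_pool1d(mixed.transpose(1, 2), kernel_size=self.downsample,
                               stride=self.downsample, ceil_mode=True).transpose(1, 2)
        return latents                                             # (batch, seq_len/downsample, d_model)

tokenizer = SoftSubwordTokenizer()
chars = torch.randint(0, 256, (1, 16))          # toy character ids
print(tokenizer(chars).shape)                    # -> torch.Size([1, 8, 64])
```

In an end-to-end system, an output neural network (e.g., a transformer encoder) would then process these latent representations to produce the task output.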
-