-
公开(公告)号:US20200342316A1
公开(公告)日:2020-10-29
申请号:US16759690
申请日:2018-10-29
申请人: GOOGLE LLC
发明人: Noam M. Shazeer , Lukasz Mieczyslaw Kaiser , Etienne Pot , Mohammad Saleh , Ben David Goodrich , Peter J. Liu , Ryan Sepassi
摘要: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for generating an output sequence from an input sequence. One of the methods includes, at each of a plurality of generation time steps: generating a combined sequence for the generation time step that includes the input sequence followed by the output tokens that have already been generated as of the generation time step; processing the combined sequence using a self-attention decoder neural network to generate a time step output that defines a score distribution over a set of possible output tokens; and selecting, using the time step output, an output token from the set of possible output tokens as the next output token in the output sequence.
-
公开(公告)号:US11556786B2
公开(公告)日:2023-01-17
申请号:US16759690
申请日:2018-10-29
申请人: GOOGLE LLC
发明人: Noam M. Shazeer , Lukasz Mieczyslaw Kaiser , Etienne Pot , Mohammad Saleh , Ben David Goodrich , Peter J. Liu , Ryan Sepassi
摘要: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for generating an output sequence from an input sequence. One of the methods includes, at each of a plurality of generation time steps: generating a combined sequence for the generation time step that includes the input sequence followed by the output tokens that have already been generated as of the generation time step; processing the combined sequence using a self-attention decoder neural network to generate a time step output that defines a score distribution over a set of possible output tokens; and selecting, using the time step output, an output token from the set of possible output tokens as the next output token in the output sequence.
-
公开(公告)号:US11556381B2
公开(公告)日:2023-01-17
申请号:US17738909
申请日:2022-05-06
申请人: Google LLC
发明人: Jeffrey Adgate Dean , Sudip Roy , Michael Acheson Isard , Aakanksha Chowdhery , Brennan Saeta , Chandramohan Amyangot Thekkath , Daniel William Hurt , Hyeontaek Lim , Laurent El Shafey , Parker Edward Schuh , Paul Ronald Barham , Ruoming Pang , Ryan Sepassi , Sanjay Ghemawat , Yonghui Wu
摘要: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for distributing machine learning workloads, e.g., computations for training a neural network or computing an inference using a neural network, across multiple hardware accelerators. One of the systems comprises a plurality of accelerator islands, each hardware accelerator island comprising a respective plurality of hardware devices that include a plurality of hardware accelerators and a corresponding host for each of the plurality of hardware accelerators; and a respective scheduler for each of the accelerator islands that is configured to schedule workloads across the plurality of accelerators and corresponding hosts in the accelerator island, wherein the system is configured to: receive data representing a machine learning workload; and assign a respective portion of the machine learning workload to each of the plurality of accelerator islands for scheduling by the respective scheduler for the accelerator island.
-
公开(公告)号:US20240220796A1
公开(公告)日:2024-07-04
申请号:US18403992
申请日:2024-01-04
申请人: Google LLC
发明人: Noam M. Shazeer , Lukasz Mieczyslaw Kaiser , Etienne Pot , Mohammad Saleh , Ben David Goodrich , Peter J. Liu , Ryan Sepassi
摘要: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for generating an output sequence from an input sequence. One of the methods includes, at each of a plurality of generation time steps: generating a combined sequence for the generation time step that includes the input sequence followed by the output tokens that have already been generated as of the generation time step; processing the combined sequence using a self-attention decoder neural network to generate a time step output that defines a score distribution over a set of possible output tokens; and selecting, using the time step output, an output token from the set of possible output tokens as the next output token in the output sequence.
-
公开(公告)号:US20230118303A1
公开(公告)日:2023-04-20
申请号:US18082415
申请日:2022-12-15
申请人: Google LLC
发明人: Jeffrey Adgate Dean , Sudip Roy , Michael Acheson Isard , Aakanksha Chowdhery , Brennan Saeta , Chandramohan Amyangot Thekkath , Daniel William Hurt , Hyeontaek Lim , Laurent El Shafey , Parker Edward Schuh , Paul Ronald Barham , Ruoming Pang , Ryan Sepassi , Sanjay Ghemawat , Yonghui Wu
摘要: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for distributing machine learning workloads, e.g., computations for training a neural network or computing an inference using a neural network, across multiple hardware accelerators. One of the systems comprises a plurality of accelerator islands, each hardware accelerator island comprising a respective plurality of hardware devices that include a plurality of hardware accelerators and a corresponding host for each of the plurality of hardware accelerators; and a respective scheduler for each of the accelerator islands that is configured to schedule workloads across the plurality of accelerators and corresponding hosts in the accelerator island, wherein the system is configured to: receive data representing a machine learning workload; and assign a respective portion of the machine learning workload to each of the plurality of accelerator islands for scheduling by the respective scheduler for the accelerator island.
-
公开(公告)号:US20240211752A1
公开(公告)日:2024-06-27
申请号:US18404014
申请日:2024-01-04
申请人: Google LLC
发明人: Noam M. Shazeer , Lukasz Mieczyslaw Kaiser , Etienne Pot , Mohammad Saleh , Ben David Goodrich , Peter J. Liu , Ryan Sepassi
摘要: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for generating an output sequence from an input sequence. One of the methods includes, at each of a plurality of generation time steps: generating a combined sequence for the generation time step that includes the input sequence followed by the output tokens that have already been generated as of the generation time step; processing the combined sequence using a self-attention decoder neural network to generate a time step output that defines a score distribution over a set of possible output tokens; and selecting, using the time step output, an output token from the set of possible output tokens as the next output token in the output sequence.
-
公开(公告)号:US20240211751A1
公开(公告)日:2024-06-27
申请号:US18403939
申请日:2024-01-04
申请人: Google LLC
发明人: Noam M. Shazeer , Lukasz Mieczyslaw Kaiser , Etienne Pot , Mohammad Saleh , Ben David Goodrich , Peter J. Liu , Ryan Sepassi
摘要: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for generating an output sequence from an input sequence. One of the methods includes, at each of a plurality of generation time steps: generating a combined sequence for the generation time step that includes the input sequence followed by the output tokens that have already been generated as of the generation time step; processing the combined sequence using a self-attention decoder neural network to generate a time step output that defines a score distribution over a set of possible output tokens; and selecting, using the time step output, an output token from the set of possible output tokens as the next output token in the output sequence.
-
公开(公告)号:US20230153613A1
公开(公告)日:2023-05-18
申请号:US18096946
申请日:2023-01-13
申请人: Google LLC
发明人: Noam M. Shazeer , Lukasz Mieczyslaw Kaiser , Etienne Pot , Mohammad Saleh , Ben Goodrich , Peter J. Liu , Ryan Sepassi
摘要: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for generating an output sequence from an input sequence. One of the methods includes, at each of a plurality of generation time steps: generating a combined sequence for the generation time step that includes the input sequence followed by the output tokens that have already been generated as of the generation time step; processing the combined sequence using a self-attention decoder neural network to generate a time step output that defines a score distribution over a set of possible output tokens; and selecting, using the time step output, an output token from the set of possible output tokens as the next output token in the output sequence.
-
公开(公告)号:US12112198B2
公开(公告)日:2024-10-08
申请号:US18082415
申请日:2022-12-15
申请人: Google LLC
发明人: Jeffrey Adgate Dean , Sudip Roy , Michael Acheson Isard , Aakanksha Chowdhery , Brennan Saeta , Chandramohan Amyangot Thekkath , Daniel William Hurt , Hyeontaek Lim , Laurent El Shafey , Parker Edward Schuh , Paul Ronald Barham , Ruoming Pang , Ryan Sepassi , Sanjay Ghemawat , Yonghui Wu
CPC分类号: G06F9/4881 , G06N3/063 , G06N3/08
摘要: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for distributing machine learning workloads, e.g., computations for training a neural network or computing an inference using a neural network, across multiple hardware accelerators. One of the systems comprises a plurality of accelerator islands, each hardware accelerator island comprising a respective plurality of hardware devices that include a plurality of hardware accelerators and a corresponding host for each of the plurality of hardware accelerators; and a respective scheduler for each of the accelerator islands that is configured to schedule workloads across the plurality of accelerators and corresponding hosts in the accelerator island, wherein the system is configured to: receive data representing a machine learning workload; and assign a respective portion of the machine learning workload to each of the plurality of accelerator islands for scheduling by the respective scheduler for the accelerator island.
-
公开(公告)号:US20240256859A1
公开(公告)日:2024-08-01
申请号:US18403966
申请日:2024-01-04
申请人: Google LLC
发明人: Noam M. Shazeer , Lukasz Mieczyslaw Kaiser , Etienne Pot , Mohammad Saleh , Ben David Goodrich , Peter J. Liu , Ryan Sepassi
摘要: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for generating an output sequence from an input sequence. One of the methods includes, at each of a plurality of generation time steps: generating a combined sequence for the generation time step that includes the input sequence followed by the output tokens that have already been generated as of the generation time step; processing the combined sequence using a self-attention decoder neural network to generate a time step output that defines a score distribution over a set of possible output tokens; and selecting, using the time step output, an output token from the set of possible output tokens as the next output token in the output sequence.
-
-
-
-
-
-
-
-
-