DATA PARALLELISM IN DISTRIBUTED TRAINING OF ARTIFICIAL INTELLIGENCE MODELS

Patent application

US20210019152A1 DATA PARALLELISM IN DISTRIBUTED TRAINING OF ARTIFICIAL INTELLIGENCE MODELS (granted)


Title: DATA PARALLELISM IN DISTRIBUTED TRAINING OF ARTIFICIAL INTELLIGENCE MODELS
Application number: US16588402

Filing date: 2019-09-30
Publication number: US20210019152A1

Publication date: 2021-01-21
Inventors: Bharadwaj Pudipeddi, Marc Tremblay, Sujeeth Subramanya Bharadwaj, Devangkumar Patel, Jinwen Xi, Maral Mesmakhosroshahi
Applicant: Microsoft Technology Licensing, LLC
Applicant address: Redmond, WA, US
Assignee: Microsoft Technology Licensing, LLC
Current assignee: Microsoft Technology Licensing, LLC
Current assignee address: Redmond, WA, US
Main classification: G06F9/38
IPC classifications: G06F9/38; H04L29/08; G06N3/08


Abstract:

Methods, systems, apparatuses, and computer program products are described herein that enable execution of a large AI model on a memory-constrained target device that is communicatively connected to a parameter server, which stores a master copy of the AI model. The AI model may be dissected into smaller portions (e.g., layers or sub-layers), and each portion may be executed as efficiently as possible on the target device. After execution of one portion of the AI model is finished, another portion of the AI model may be downloaded and executed at the target device. To improve efficiency, the input samples may be divided into microbatches, and a plurality of microbatches executing in sequential order may form a minibatch. The size of the group of microbatches or minibatch can be adjusted to reduce the communication overhead. Multi-level parallel parameters reduction may be performed at the parameter server and the target device.
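The layer-by-layer execution described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the `ParameterServer` class, the `fetch_layer`, `split_into_microbatches`, and `run_minibatch` names, and the use of a single scalar weight per layer are all hypothetical simplifications. The sketch shows only the forward pass, with one layer resident on the device at a time and every microbatch of the minibatch run through that layer before the next layer is downloaded.

```python
from typing import List


class ParameterServer:
    """Hypothetical stand-in for the server holding the master copy of the model."""

    def __init__(self, layer_weights: List[float]):
        self.layer_weights = layer_weights  # one scalar weight per layer (toy model)
        self.downloads = 0                  # counts layer transfers to the device

    def fetch_layer(self, index: int) -> float:
        self.downloads += 1
        return self.layer_weights[index]


def split_into_microbatches(samples: List[float], microbatch_size: int) -> List[List[float]]:
    """Divide the input samples into consecutive microbatches."""
    return [samples[i:i + microbatch_size]
            for i in range(0, len(samples), microbatch_size)]


def run_minibatch(server: ParameterServer,
                  microbatches: List[List[float]]) -> List[List[float]]:
    """Fetch each layer once, then run all microbatches through it in order."""
    activations = [list(mb) for mb in microbatches]
    for layer_index in range(len(server.layer_weights)):
        weight = server.fetch_layer(layer_index)  # only this layer is on the device
        activations = [[weight * x for x in mb] for mb in activations]
    return activations


server = ParameterServer([2.0, 3.0, 0.5])
samples = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
microbatches = split_into_microbatches(samples, 2)
outputs = run_minibatch(server, microbatches)
```

In this toy setting each layer is downloaded once per minibatch regardless of how many microbatches it contains, which is one way to read the abstract's statement that adjusting the minibatch size reduces communication overhead.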

Publication/grant documents

US11436019B2 Data parallelism in distributed training of artificial intelligence models (publication/grant date: 2022-09-06)


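IPC classification:

G	Physics
G06	Computing; calculating or counting
G06F	Electric digital data processing (computer systems based on specific computational models: see G06N)
G06F9/00	Arrangements for program control, e.g. control units (program control for peripheral devices: see G06F13/10)
G06F9/06	.using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
G06F9/30	..Arrangements for executing machine instructions, e.g. instruction decode (for executing microinstructions: see G06F9/22)
G06F9/38	...Concurrent instruction execution, e.g. pipeline, look ahead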