METHODS AND MODULES FOR ACCELERATING INFERENCE VIA DISTRIBUTED DEVICES
Abstract:
Methods and modules for accelerating inference computations in transformer models using edge devices include partitioning inputs for each layer and synchronizing between transformer layers. A method includes receiving a transformer input, partitioning the transformer input into two or more first-stage divisions, processing each first-stage division into a processed first-stage division, and combining the processed first-stage divisions into a first output. A module includes a computing device that partitions a transformer input into two or more divisions, transmits each division, and receives processed divisions, as well as two or more transformer processing units, each of which receives a division from the computing device, processes the division into a processed division, and sends the processed division back to the computing device.
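The sketch below is a minimal, hypothetical illustration of the partition-process-combine flow the abstract describes; it is not the patented implementation. The function names (partition_input, process_division, combine_divisions), the choice of splitting along the sequence dimension, and the use of a single linear projection to stand in for per-device transformer work are all assumptions introduced for illustration.

```python
# Hypothetical sketch of the partition / process / combine pattern.
# All names and the split axis are illustrative assumptions, not from the patent text.
import numpy as np


def partition_input(x: np.ndarray, num_divisions: int) -> list[np.ndarray]:
    """Split a transformer layer input into first-stage divisions along the sequence axis."""
    return np.array_split(x, num_divisions, axis=0)


def process_division(division: np.ndarray, weight: np.ndarray) -> np.ndarray:
    """Stand-in for the work a transformer processing unit performs on one division
    (here, a single linear projection)."""
    return division @ weight


def combine_divisions(processed: list[np.ndarray]) -> np.ndarray:
    """Reassemble the processed divisions into the first output before the next layer."""
    return np.concatenate(processed, axis=0)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.standard_normal((8, 16))   # (sequence, hidden) transformer input
    w = rng.standard_normal((16, 16))  # shared layer weight

    divisions = partition_input(x, num_divisions=2)           # partition step
    processed = [process_division(d, w) for d in divisions]   # per-device step
    output = combine_divisions(processed)                     # combine step

    # Sanity check: the partitioned computation matches the unpartitioned one.
    assert np.allclose(output, x @ w)
```

In this toy setup the per-device calls run sequentially; in the distributed arrangement the abstract describes, each division would be transmitted to a separate transformer processing unit and the computing device would synchronize the returned results between layers.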