Patent search ap:("Amazon Technologies Page Inc.") AND inv:"Zhi Chen"

1.

Invention application
HIERARCHICAL PARTITIONING OF OPERATORS (in force)

Publication No.: US20210158131A1

Publication date: 2021-05-27

Application No.: US16698236

Filing date: 2019-11-27

Applicant: Amazon Technologies, Inc.

Inventor: Animesh Jain, Yizhi Liu, Hongbin Zheng, Jeffrey T. Huynh, Haichen Li, Drazen Borkovic, Jindrich Zejda, Richard John Heaton, Randy Renfu Huang, Zhi Chen, Yida Wang

IPC: G06N3/063, G06N3/04

Abstract: Methods and apparatuses for hierarchical partitioning of operators of a neural network for execution on an acceleration engine are provided. Neural networks are built in machine learning frameworks using neural network operators. The neural network operators are compiled into executable code for the acceleration engine. Development of new framework-level operators can exceed the capability to map the newly developed framework-level operators onto the acceleration engine. To enable neural networks to be executed on an acceleration engine, hierarchical partitioning can be used to partition the operators of the neural network. The hierarchical partitioning can identify operators that are supported by a compiler for execution on the acceleration engine, operators to be compiled for execution on a host processor, and operators to be executed on the machine learning framework.
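
The three-way split described in the abstract can be illustrated with a minimal sketch. It assumes that support is decided per operator by simple set membership, which the abstract does not state; the operator names, the `Target` enum, and the `partition_operators` function are all hypothetical and are not taken from the patent.

```python
from enum import Enum


class Target(Enum):
    ACCELERATOR = "accelerator"  # compiled for the acceleration engine
    HOST = "host"                # compiled for the host processor
    FRAMEWORK = "framework"      # left to the machine learning framework


def partition_operators(operators, accel_supported, host_supported):
    """Assign each operator to the first tier that supports it.

    Tiers are tried in order: acceleration engine, host processor,
    then the framework as the final fallback.
    """
    assignment = {}
    for op in operators:
        if op in accel_supported:
            assignment[op] = Target.ACCELERATOR
        elif op in host_supported:
            assignment[op] = Target.HOST
        else:
            assignment[op] = Target.FRAMEWORK
    return assignment


# Hypothetical operator names, for illustration only.
ops = ["conv2d", "relu", "topk", "custom_nms"]
result = partition_operators(
    ops,
    accel_supported={"conv2d", "relu"},
    host_supported={"topk"},
)
for name, target in result.items():
    print(f"{name}: {target.value}")
```

A real compiler would likely partition connected subgraphs rather than individual operators, so that data does not cross between devices more often than necessary; this sketch ignores graph structure entirely.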

2.

Granted invention patent
Hierarchical partitioning of operators (in force)

Publication No.: US12182688B2

Publication date: 2024-12-31

Application No.: US16698236

Filing date: 2019-11-27

Applicant: Amazon Technologies, Inc.

Inventor: Animesh Jain, Yizhi Liu, Hongbin Zheng, Jeffrey T. Huynh, Haichen Li, Drazen Borkovic, Jindrich Zejda, Richard John Heaton, Randy Renfu Huang, Zhi Chen, Yida Wang

IPC: G06N3/063, G06N3/04

Abstract: Methods and apparatuses for hierarchical partitioning of operators of a neural network for execution on an acceleration engine are provided. Neural networks are built in machine learning frameworks using neural network operators. The neural network operators are compiled into executable code for the acceleration engine. Development of new framework-level operators can exceed the capability to map the newly developed framework-level operators onto the acceleration engine. To enable neural networks to be executed on an acceleration engine, hierarchical partitioning can be used to partition the operators of the neural network. The hierarchical partitioning can identify operators that are supported by a compiler for execution on the acceleration engine, operators to be compiled for execution on a host processor, and operators to be executed on the machine learning framework.

3.

Granted invention patent
Unified optimization for convolutional neural network model inference on integrated graphics processing units (in force)

Publication No.: US11797876B1

Publication date: 2023-10-24

Application No.: US16453489

Filing date: 2019-06-26

Applicant: Amazon Technologies, Inc.

Inventor: Leyuan Wang, Yida Wang, Mu Li, Zhi Chen, Yizhi Liu, Yao Wang

IPC: G06N20/00, G06T7/10, G06T1/20, G06F9/455, G06N3/082, G06F8/41, G06F18/24

CPC classification number: G06N20/00, G06F8/443, G06F8/447, G06F8/451, G06F9/45558, G06F18/24, G06N3/082, G06T1/20, G06T7/10, G06F2009/4557

Abstract: Techniques for optimizing and deploying convolutional neural network (CNN) machine learning models for inference using integrated graphics processing units are described. A model compilation system optimizes CNN models using optimized vision-specific operators as well as both graph-level tuning and tensor-level tuning to explore the optimization space for achieving heightened performance. The model compilation system may also implement a heuristic-based two-stage technique for falling back certain operators of CNN models to use CPUs when needed or otherwise beneficial.
