-
公开(公告)号:US20210158131A1
公开(公告)日:2021-05-27
申请号:US16698236
申请日:2019-11-27
Applicant: Amazon Technologies, Inc.
Inventor: Animesh Jain , Yizhi Liu , Hongbin Zheng , Jeffrey T. Huynh , Haichen Li , Drazen Borkovic , Jindrich Zejda , Richard John Heaton , Randy Renfu Huang , Zhi Chen , Yida Wang
Abstract: Methods and apparatuses for hierarchical partitioning of operators of a neural network for execution on an acceleration engine are provided. Neural networks are built in machine learning frameworks using neural network operators. The neural network operators are compiled into executable code for the acceleration engine. Development of new framework-level operators can exceed the capability to map the newly developed framework-level operators onto the acceleration engine. To enable neural networks to be executed on an acceleration engine, hierarchical partitioning can be used to partition the operators of the neural network. The hierarchical partitioning can identify operators that are supported by a compiler for execution on the acceleration engine, operators to be compiled for execution on a host processor, and operators to be executed on the machine learning framework.
-
公开(公告)号:US12182688B2
公开(公告)日:2024-12-31
申请号:US16698236
申请日:2019-11-27
Applicant: Amazon Technologies, Inc.
Inventor: Animesh Jain , Yizhi Liu , Hongbin Zheng , Jeffrey T. Huynh , Haichen Li , Drazen Borkovic , Jindrich Zejda , Richard John Heaton , Randy Renfu Huang , Zhi Chen , Yida Wang
Abstract: Methods and apparatuses for hierarchical partitioning of operators of a neural network for execution on an acceleration engine are provided. Neural networks are built in machine learning frameworks using neural network operators. The neural network operators are compiled into executable code for the acceleration engine. Development of new framework-level operators can exceed the capability to map the newly developed framework-level operators onto the acceleration engine. To enable neural networks to be executed on an acceleration engine, hierarchical partitioning can be used to partition the operators of the neural network. The hierarchical partitioning can identify operators that are supported by a compiler for execution on the acceleration engine, operators to be compiled for execution on a host processor, and operators to be executed on the machine learning framework.
-
公开(公告)号:US11797876B1
公开(公告)日:2023-10-24
申请号:US16453489
申请日:2019-06-26
Applicant: Amazon Technologies, Inc.
CPC classification number: G06N20/00 , G06F8/443 , G06F8/447 , G06F8/451 , G06F9/45558 , G06F18/24 , G06N3/082 , G06T1/20 , G06T7/10 , G06F2009/4557
Abstract: Techniques for optimizing and deploying convolutional neural network (CNN) machine learning models for inference using integrated graphics processing units are described. A model compilation system optimizes CNN models using optimized vision-specific operators as well as both graph-level tuning and tensor-level tuning to explore the optimization space for achieving heightened performance. The model compilation system may also implement a heuristic-based two-stage technique for falling back certain operators of CNN models to use CPUs when needed or otherwise beneficial.
-
-