-
Publication number: US11176489B1
Publication date: 2021-11-16
Application number: US15954913
Application date: 2018-04-17
Applicant: Amazon Technologies, Inc.
Inventor: Alexander Johannes Smola, Edo Liberty, Mu Li, Leyuan Wang
Abstract: Techniques for determining and utilizing optimal aggregation schedules are described. A deep machine learning model can be trained using multiple processing elements implemented in one or multiple computing devices and that are interconnected using one or multiple types of links. An optimal aggregation schedule for such arbitrary topologies can be determined automatically. The determination may include solving a linear program on the spanning tree polytope. The optimal aggregation schedule can be utilized by the multiple processing elements to train the deep machine learning model.
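Illustrative sketch (not taken from the patent): the abstract does not state the linear program's objective or constraints. As a simplified stand-in, the Python below assumes the objective is linear in per-link bandwidth. Under that assumption an optimum over the spanning tree polytope is attained at a single spanning tree, which a greedy Kruskal pass finds. The function name, the `(u, v, bandwidth)` link format, and the 4-device topology are all hypothetical.

```python
def max_bandwidth_spanning_tree(num_nodes, links):
    """Return a spanning tree maximizing total link bandwidth (Kruskal).

    links: list of (u, v, bandwidth) tuples describing an undirected
    interconnect between processing elements 0..num_nodes-1.
    This is an assumed, simplified objective, not the patented method.
    """
    parent = list(range(num_nodes))  # union-find forest

    def find(x):
        # Find the set representative, compressing the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    tree = []
    # Consider the fastest links first.
    for u, v, bw in sorted(links, key=lambda e: e[2], reverse=True):
        ru, rv = find(u), find(v)
        if ru != rv:  # the link joins two components, so keep it
            parent[ru] = rv
            tree.append((u, v, bw))
    return tree


# Hypothetical topology: two fast pairwise links, slower cross links.
links = [
    (0, 1, 50.0),
    (2, 3, 50.0),
    (0, 2, 16.0),
    (1, 3, 16.0),
    (0, 3, 8.0),
]
tree = max_bandwidth_spanning_tree(4, links)
total_bandwidth = sum(bw for _, _, bw in tree)
```

In this sketch the selected tree would serve as the path along which partial results are aggregated; a schedule that splits traffic across several trees would require solving the linear program itself rather than taking a single greedy tree.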
-
Publication number: US11797876B1
Publication date: 2023-10-24
Application number: US16453489
Application date: 2019-06-26
Applicant: Amazon Technologies, Inc.
CPC classification number: G06N20/00, G06F8/443, G06F8/447, G06F8/451, G06F9/45558, G06F18/24, G06N3/082, G06T1/20, G06T7/10, G06F2009/4557
Abstract: Techniques for optimizing and deploying convolutional neural network (CNN) machine learning models for inference using integrated graphics processing units are described. A model compilation system optimizes CNN models using optimized vision-specific operators as well as both graph-level tuning and tensor-level tuning to explore the optimization space for achieving heightened performance. The model compilation system may also implement a heuristic-based two-stage technique for falling back certain operators of CNN models to use CPUs when needed or otherwise beneficial.
-