-
Publication No.: US10949252B1
Publication date: 2021-03-16
Application No.: US15895747
Filing date: 2018-02-13
Applicant: Amazon Technologies, Inc.
Inventor: Sandeep Krishnamurthy, Jiajie Chen, Jonathan Esterhazy, Naveen Mysore Nagendra Swamy, Ruofei Yu, Yao Wang, Roshani Nagmote, Hagay Lupesko, Vikram Madan
Abstract: Techniques for benchmarking a machine learning model/algorithm are described. For example, in some instances a method includes generating an execution plan for benchmarking of at least one task corresponding to a machine learning model based on an identified machine learning model, identified training data, and at least one objective for the benchmarking job; receiving execution statistics about the execution of the task as a part of the benchmarking job according to the execution plan; and updating the execution plan based at least in part on the received execution statistics of the task.
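The plan, execute, and update loop in this abstract can be sketched roughly as follows. This is a minimal illustration only, not the patented method: `ExecutionPlan`, `generate_plan`, `run_task`, `update_plan`, the batch-size search space, and the synthetic throughput numbers are all assumptions introduced here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ExecutionPlan:
    # Hypothetical plan: what is being benchmarked, plus candidate batch sizes to try.
    model_id: str
    data_id: str
    objective: str
    pending: list
    best: Optional[tuple] = None  # (batch_size, throughput) of the best task so far


def generate_plan(model_id, data_id, objective):
    # Assumption: the plan sweeps batch sizes; a real system could sweep anything.
    return ExecutionPlan(model_id, data_id, objective, pending=[16, 32, 64, 128, 256])


def run_task(batch_size):
    # Stand-in for executing one benchmarking task; the statistics are synthetic
    # and peak at batch size 64.
    return {"batch_size": batch_size, "throughput": 1000 - abs(batch_size - 64) * 5}


def update_plan(plan, stats):
    # Update the plan from the received execution statistics.
    current = (stats["batch_size"], stats["throughput"])
    if plan.best is None or current[1] > plan.best[1]:
        plan.best = current
    else:
        # Throughput regressed: drop candidates at least as large as the one just tried.
        plan.pending = [b for b in plan.pending if b < stats["batch_size"]]
    return plan


def benchmark(plan):
    while plan.pending:
        stats = run_task(plan.pending.pop(0))
        update_plan(plan, stats)
    return plan.best
```

In this toy setup the loop stops early: once batch size 128 performs worse than 64, the remaining larger candidate is pruned without being run.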
-
Publication No.: US11423283B1
Publication date: 2022-08-23
Application No.: US15933114
Filing date: 2018-03-22
Applicant: Amazon Technologies, Inc.
Inventor: Hagay Lupesko, Dominic Rajeev Divakaruni, Jonathan Esterhazy, Sandeep Krishnamurthy, Vikram Madan, Roshani Nagmote, Naveen Mysore Nagendra Swamy, Yao Wang
Abstract: Techniques for model adaptation are described. For example, a method is detailed that includes receiving a call to provide either a model variant or a model variant profile of a deep learning model, the call including a desired performance of the deep learning model, a deep learning model identifier, and current edge device characteristics; comparing the received current edge device characteristics to available model variants and profiles based on the desired performance of the deep learning model to generate or select a model variant or profile, the available model variants and profiles determined by the model identifier; and sending the generated or selected model variant or profile to the edge device for use in inference.
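The selection step described here (look up the variants for a model identifier, filter by the device's characteristics and the desired performance, then pick one) might look something like the sketch below. The `Variant` fields, the `CATALOG` contents, and the "highest accuracy among feasible variants" rule are assumptions for illustration; the abstract does not specify them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    # Hypothetical profile of one model variant.
    name: str
    memory_mb: int
    latency_ms: float
    accuracy: float


# Hypothetical catalog keyed by model identifier; all numbers are made up.
CATALOG = {
    "resnet-demo": [
        Variant("fp32", 400, 120.0, 0.76),
        Variant("fp16", 200, 70.0, 0.75),
        Variant("int8", 100, 40.0, 0.72),
    ]
}


def select_variant(model_id, device_memory_mb, max_latency_ms):
    # Keep variants that fit the device and meet the desired latency.
    candidates = [
        v
        for v in CATALOG.get(model_id, [])
        if v.memory_mb <= device_memory_mb and v.latency_ms <= max_latency_ms
    ]
    if not candidates:
        return None  # A real service might generate a new variant here instead.
    # Assumed tie-break: prefer the most accurate feasible variant.
    return max(candidates, key=lambda v: v.accuracy)
```

A device reporting 256 MB of free memory with a 100 ms latency target would receive the `fp16` variant under these assumed profiles.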
-
Publication No.: US12293299B1
Publication date: 2025-05-06
Application No.: US17338047
Filing date: 2021-06-03
Applicant: Amazon Technologies, Inc.
Inventor: Vinod Sharma, Yao Wang, Xingyu Zhou, Yanming Wang, Yong Wu, Rui Li
Abstract: Techniques for optimizing and deploying deep neural network (DNN) machine learning models for inference using static analysis are described. A method includes obtaining a deep neural network (DNN) machine learning (ML) model, generating an intermediate representation for the ML model, the intermediate representation including one or more nodes corresponding to one or more operators utilized by the ML model, identifying, for at least one node of the intermediate representation, an optimized schedule for at least one operator corresponding to the at least one node using a static analysis that is based on a hardware-specific cost model, generating an optimized intermediate representation using the optimized schedule that is optimized for execution on a hardware platform, and generating code corresponding to the ML model based at least in part on the optimized intermediate representation, wherein the code is specific to the hardware platform.
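Choosing a schedule by static analysis means ranking candidates with an analytical cost model instead of running them on the target. A toy sketch of that idea is given below, assuming a tiled matrix-multiply operator, a single cache-size hardware parameter, and a crude memory-traffic estimate. The function names and the cost formula are illustrative assumptions and are not taken from the patent.

```python
import math


def estimated_cost(m, n, k, tile, cache_bytes, elem_bytes=4):
    # Toy hardware-specific cost model: three tile-sized blocks must fit in cache.
    working_set = 3 * tile * tile * elem_bytes
    if working_set > cache_bytes:
        return math.inf  # Schedule is infeasible on this hardware.
    # Rough memory-traffic estimate: larger tiles reuse more data.
    return 2 * m * n * k / tile + m * n


def pick_schedule(m, n, k, cache_bytes, tiles=(8, 16, 32, 64, 128)):
    # Static analysis: no measurement, only the cost model ranks the candidates.
    return min(tiles, key=lambda t: estimated_cost(m, n, k, t, cache_bytes))


def optimize_ir(nodes, cache_bytes):
    # Annotate each matmul node of a (hypothetical) IR with its chosen tile size.
    optimized = []
    for node in nodes:
        node = dict(node)
        if node["op"] == "matmul":
            node["tile"] = pick_schedule(*node["shape"], cache_bytes)
        optimized.append(node)
    return optimized
```

With a 32 KiB cache the model selects a tile size of 32 for a 512x512x512 multiply, since a 64-wide tile would no longer fit; a 256 KiB cache allows 128.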
-
Publication No.: US11797876B1
Publication date: 2023-10-24
Application No.: US16453489
Filing date: 2019-06-26
Applicant: Amazon Technologies, Inc.
CPC classification number: G06N20/00, G06F8/443, G06F8/447, G06F8/451, G06F9/45558, G06F18/24, G06N3/082, G06T1/20, G06T7/10, G06F2009/4557
Abstract: Techniques for optimizing and deploying convolutional neural network (CNN) machine learning models for inference using integrated graphics processing units are described. A model compilation system optimizes CNN models using optimized vision-specific operators as well as both graph-level tuning and tensor-level tuning to explore the optimization space for achieving heightened performance. The model compilation system may also implement a heuristic-based two-stage technique for falling back certain operators of CNN models to use CPUs when needed or otherwise beneficial.
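The abstract does not say what the two stages of the fallback heuristic are, so the sketch below is purely a guess at one plausible shape. It assumes stage 1 moves operators the integrated GPU cannot run to the CPU, and stage 2 also moves GPU-capable operators that sit between two CPU operators when the transfer overhead outweighs the GPU speedup. The operator records, timings, and `transfer_ms` parameter are all hypothetical.

```python
def place_operators(ops, transfer_ms=1.0):
    # Stage 1 (assumed): operators unsupported on the GPU must fall back to the CPU.
    placement = ["gpu" if op["gpu_supported"] else "cpu" for op in ops]

    # Stage 2 (assumed): a GPU operator sandwiched between CPU operators pays two
    # device transfers; fall it back when that makes the GPU no faster than the CPU.
    for i, op in enumerate(ops):
        if placement[i] != "gpu":
            continue
        left_cpu = i > 0 and placement[i - 1] == "cpu"
        right_cpu = i < len(ops) - 1 and placement[i + 1] == "cpu"
        if left_cpu and right_cpu and op["gpu_ms"] + 2 * transfer_ms >= op["cpu_ms"]:
            placement[i] = "cpu"
    return placement


# Hypothetical operator sequence with made-up timings in milliseconds.
EXAMPLE_OPS = [
    {"name": "conv2d", "gpu_supported": True, "gpu_ms": 2.0, "cpu_ms": 10.0},
    {"name": "nms", "gpu_supported": False, "gpu_ms": 0.0, "cpu_ms": 1.0},
    {"name": "relu", "gpu_supported": True, "gpu_ms": 0.1, "cpu_ms": 0.5},
    {"name": "sort", "gpu_supported": False, "gpu_ms": 0.0, "cpu_ms": 0.8},
]
```

Under these assumed numbers `relu` is placed on the CPU even though the GPU supports it, because copying data in and out would cost more than it saves.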