Patent search ap:("Amazon Technologies Page Inc.") AND inv:"Yao Wang"

1.

Granted invention patent
Benchmarking machine learning models via performance feedback (Active)

Publication No.: US10949252B1

Publication date: 2021-03-16

Application No.: US15895747

Filing date: 2018-02-13

Applicant: Amazon Technologies, Inc.

Inventor: Sandeep Krishnamurthy, Jiajie Chen, Jonathan Esterhazy, Naveen Mysore Nagendra Swamy, Ruofei Yu, Yao Wang, Roshani Nagmote, Hagay Lupesko, Vikram Madan

IPC: G06F9/48, G06N20/00

Abstract: Techniques for benchmarking a machine learning model/algorithm are described. For example, in some instances a method includes generating an execution plan for benchmarking of at least one task corresponding to a machine learning model based on an identified machine learning model, identified training data, and at least one objective for the benchmarking job; receiving execution statistics about the execution of the task as a part of the benchmarking job according to the execution plan; and updating the execution plan based at least in part on the received execution statistics of the task.
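The plan, measure, update loop in this abstract can be illustrated with a minimal sketch. Everything named here is hypothetical and not taken from the patent: the `ExecutionPlan` fields, the latency objective, the halving rule, and the `run_task` stand-in that fakes execution statistics.

```python
from dataclasses import dataclass


@dataclass
class ExecutionPlan:
    batch_size: int           # hypothetical tunable parameter of the plan
    target_latency_ms: float  # hypothetical benchmarking objective


def generate_plan(target_latency_ms, initial_batch_size=64):
    """Build the initial plan from the objective (model and data omitted)."""
    return ExecutionPlan(initial_batch_size, target_latency_ms)


def run_task(plan):
    """Stand-in for real execution: latency grows linearly with batch size."""
    return 2.0 * plan.batch_size


def update_plan(plan, observed_latency_ms):
    """Halve the batch size if the objective was missed, else keep the plan."""
    if observed_latency_ms > plan.target_latency_ms and plan.batch_size > 1:
        return ExecutionPlan(plan.batch_size // 2, plan.target_latency_ms)
    return plan


def benchmark(target_latency_ms, max_rounds=10):
    plan = generate_plan(target_latency_ms)
    history = []
    for _ in range(max_rounds):
        latency = run_task(plan)                 # execution statistics
        history.append((plan.batch_size, latency))
        new_plan = update_plan(plan, latency)    # feedback step
        if new_plan == plan:                     # plan is stable, stop
            break
        plan = new_plan
    return plan, history
```

Calling `benchmark(40.0)` runs the task, feeds the measured latency back into the plan, and stops once the plan no longer changes.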

2.

Granted invention patent
Model adaptation (Active)

Publication No.: US11423283B1

Publication date: 2022-08-23

Application No.: US15933114

Filing date: 2018-03-22

Applicant: Amazon Technologies, Inc.

Inventor: Hagay Lupesko, Dominic Rajeev Divakaruni, Jonathan Esterhazy, Sandeep Krishnamurthy, Vikram Madan, Roshani Nagmote, Naveen Mysore Nagendra Swamy, Yao Wang

IPC: G06N3/04, G06N3/08, G06N3/10

Abstract: Techniques for model adaptation are described. For example, a method of receiving a call to provide either a model variant or a model variant profile of a deep learning model, the call including desired performance of the deep learning model, a deep learning model identifier, and current edge device characteristics; comparing the received current edge device characteristics to available model variants and profiles based on the desired performance of the deep learning model to generate or select a model variant or profile, the available model variants and profiles determined by the model identifier; and sending the generated or selected model variant or profile to the edge device to use in inference is detailed.
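A rough sketch of the selection step described above: given a model identifier, edge device characteristics, and a desired performance level, pick a matching variant. The catalog contents, the `Variant` fields, and the "most accurate variant that fits" rule are all illustrative assumptions, not details from the patent. Treating latency as a fixed property of a variant is also a simplification.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Variant:
    name: str
    memory_mb: int      # memory the variant needs on the device
    latency_ms: float   # assumed per-inference latency
    accuracy: float


# Hypothetical catalog keyed by model identifier.
CATALOG = {
    "image-classifier": [
        Variant("fp32-full", 400, 120.0, 0.92),
        Variant("int8-quantized", 110, 45.0, 0.90),
        Variant("pruned-small", 40, 20.0, 0.84),
    ]
}


def select_variant(model_id: str, device_memory_mb: int,
                   max_latency_ms: float) -> Optional[Variant]:
    """Return the most accurate variant that fits the device and the
    desired latency, or None if no variant qualifies."""
    candidates = [
        v for v in CATALOG.get(model_id, [])
        if v.memory_mb <= device_memory_mb and v.latency_ms <= max_latency_ms
    ]
    return max(candidates, key=lambda v: v.accuracy, default=None)
```

For example, `select_variant("image-classifier", 128, 50.0)` filters out the full-precision variant on memory and then prefers the quantized variant over the pruned one on accuracy.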

3.

Granted invention patent
Analytical model to optimize deep learning models (Active)

Publication No.: US12293299B1

Publication date: 2025-05-06

Application No.: US17338047

Filing date: 2021-06-03

Applicant: Amazon Technologies, Inc.

Inventor: Vinod Sharma, Yao Wang, Xingyu Zhou, Yanming Wang, Yong Wu, Rui Li

IPC: G06N3/10, G06F9/38, G06N3/04

Abstract: Techniques for optimizing and deploying deep neural network (DNN) machine learning models for inference using static analysis are described. A method includes obtaining a deep neural network (DNN) machine learning (ML) model, generating an intermediate representation for the ML model, the intermediate representation including one or more nodes corresponding to one or more operators utilized by the ML model, identifying, for at least one node of the intermediate representation, an optimized schedule for at least one operator corresponding to the at least one node using a static analysis that is based on a hardware-specific cost model, generating an optimized intermediate representation using the optimized schedule that is optimized for execution on a hardware platform, and generating code corresponding to the ML model based at least in part on the optimized intermediate representation, wherein the code is specific to the hardware platform.
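To illustrate what choosing a schedule by static analysis against a hardware-specific cost model can look like, here is a toy sketch. It assumes a single matrix-multiply node, tile sizes as the only schedule parameter, and a cost model that counts elements moved through a cache of known capacity. None of these names or formulas come from the patent. The point is only that candidates are ranked analytically, without running them on hardware.

```python
from dataclasses import dataclass
from itertools import product
from math import ceil


@dataclass(frozen=True)
class MatMulNode:
    """Hypothetical IR node: C[m, n] = A[m, k] @ B[k, n]."""
    m: int
    n: int
    k: int


def estimated_cost(node, tile_m, tile_n, cache_elems):
    """Analytical cost: elements loaded or stored over all output tiles.
    A tile whose working set exceeds the cache is treated as infeasible."""
    working_set = tile_m * node.k + node.k * tile_n + tile_m * tile_n
    if working_set > cache_elems:
        return float("inf")
    num_tiles = ceil(node.m / tile_m) * ceil(node.n / tile_n)
    return num_tiles * working_set


def pick_schedule(node, cache_elems, tile_sizes=(8, 16, 32, 64)):
    """Return the (tile_m, tile_n) pair with the lowest estimated cost."""
    best = min(
        product(tile_sizes, repeat=2),
        key=lambda t: estimated_cost(node, t[0], t[1], cache_elems),
    )
    cost = estimated_cost(node, best[0], best[1], cache_elems)
    if cost == float("inf"):
        raise ValueError("no candidate tile fits in the cache")
    return best, cost
```

In a fuller pipeline, the chosen tile sizes would annotate the node in the optimized intermediate representation before code generation. That part is omitted here.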

4.

Granted invention patent
Unified optimization for convolutional neural network model inference on integrated graphics processing units (Active)

Publication No.: US11797876B1

Publication date: 2023-10-24

Application No.: US16453489

Filing date: 2019-06-26

Applicant: Amazon Technologies, Inc.

Inventor: Leyuan Wang, Yida Wang, Mu Li, Zhi Chen, Yizhi Liu, Yao Wang

IPC: G06N20/00, G06T7/10, G06T1/20, G06F9/455, G06N3/082, G06F8/41, G06F18/24

CPC classification number: G06N20/00, G06F8/443, G06F8/447, G06F8/451, G06F9/45558, G06F18/24, G06N3/082, G06T1/20, G06T7/10, G06F2009/4557

Abstract: Techniques for optimizing and deploying convolutional neural network (CNN) machine learning models for inference using integrated graphics processing units are described. A model compilation system optimizes CNN models using optimized vision-specific operators as well as both graph-level tuning and tensor-level tuning to explore the optimization space for achieving heightened performance. The model compilation system may also implement a heuristic-based two-stage technique for falling back certain operators of CNN models to use CPUs when needed or otherwise beneficial.
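The abstract does not say what the two stages of the CPU fallback are. The sketch below is one hypothetical reading, offered only to make the idea concrete: stage 1 sends operators with no GPU implementation to the CPU, and stage 2 also moves any single GPU operator sandwiched between two CPU operators, on the assumption that two device transfers would cost more than running it on the CPU. The operator names and the linear operator list are illustrative.

```python
def place_operators(ops, gpu_supported):
    """Assign each operator in a linear sequence to "gpu" or "cpu".

    ops: list of operator names in execution order (hypothetical).
    gpu_supported: set of operator names with a GPU implementation.
    """
    # Stage 1: fall back when needed (no GPU implementation available).
    stage1 = ["gpu" if op in gpu_supported else "cpu" for op in ops]

    # Stage 2: fall back when assumed beneficial. A lone GPU operator
    # between two CPU operators would need a transfer in and out.
    placement = list(stage1)
    for i in range(1, len(ops) - 1):
        if (stage1[i] == "gpu"
                and stage1[i - 1] == "cpu"
                and stage1[i + 1] == "cpu"):
            placement[i] = "cpu"

    return list(zip(ops, placement))
```

With `ops = ["conv2d", "nms", "relu", "sort", "conv2d", "relu"]` and GPU support for `conv2d` and `relu` only, stage 1 moves `nms` and `sort` to the CPU and stage 2 moves the `relu` between them.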
