Publication No.: US12293299B1
Publication Date: 2025-05-06
Application No.: US17338047
Filing Date: 2021-06-03
Applicant: Amazon Technologies, Inc.
Inventor: Vinod Sharma, Yao Wang, Xingyu Zhou, Yanming Wang, Yong Wu, Rui Li
Abstract: Techniques for optimizing and deploying deep neural network (DNN) machine learning models for inference using static analysis are described. A method includes obtaining a deep neural network (DNN) machine learning (ML) model, generating an intermediate representation for the ML model, the intermediate representation including one or more nodes corresponding to one or more operators utilized by the ML model, identifying, for at least one node of the intermediate representation, an optimized schedule for at least one operator corresponding to the at least one node using a static analysis that is based on a hardware-specific cost model, generating an optimized intermediate representation using the optimized schedule that is optimized for execution on a hardware platform, and generating code corresponding to the ML model based at least in part on the optimized intermediate representation, wherein the code is specific to the hardware platform.
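
Illustrative sketch (not taken from the patent): the flow described in the abstract could be pictured roughly as follows. Every name here (`Node`, `Schedule`, `Hardware`, `static_cost`, `optimize`, `generate_code`), the tile-size candidates, and the simple compute-plus-memory cost formula are assumptions made for illustration only. The point is that a schedule is chosen per operator from an analytical, hardware-specific cost estimate, without running the operator on the device.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Schedule:
    tile: int  # hypothetical square tile edge, in elements


@dataclass(frozen=True)
class Hardware:
    name: str
    peak_flops: float     # FLOP/s
    mem_bandwidth: float  # bytes/s
    cache_bytes: int


@dataclass
class Node:
    op: str                # operator used by the ML model
    flops: float           # arithmetic work of the operator
    bytes_moved: float     # memory traffic with no data reuse
    schedule: Optional[Schedule] = None


# Hypothetical candidate schedules considered for every operator.
CANDIDATES = [Schedule(8), Schedule(32), Schedule(128)]


def static_cost(node: Node, sched: Schedule, hw: Hardware) -> float:
    """Assumed analytical cost model: estimated seconds, nothing is executed."""
    working_set = sched.tile * sched.tile * 4  # bytes for one float32 tile
    # A tile that fits in cache is assumed to be reused `tile` times.
    reuse = sched.tile if working_set <= hw.cache_bytes else 1
    compute_time = node.flops / hw.peak_flops
    memory_time = (node.bytes_moved / reuse) / hw.mem_bandwidth
    return compute_time + memory_time


def optimize(ir: List[Node], hw: Hardware) -> List[Node]:
    """Return an optimized IR: each node annotated with its cheapest schedule."""
    return [
        Node(n.op, n.flops, n.bytes_moved,
             min(CANDIDATES, key=lambda s: static_cost(n, s, hw)))
        for n in ir
    ]


def generate_code(ir: List[Node], hw: Hardware) -> str:
    """Emit placeholder, hardware-specific kernel calls for the optimized IR."""
    return "\n".join(f"{n.op}_{hw.name}(tile={n.schedule.tile})" for n in ir)


if __name__ == "__main__":
    ir = [Node("conv2d", 2e9, 4e8), Node("dense", 1e8, 4e7)]
    target = Hardware("small_cache", 1e11, 1e10, 32 * 1024)
    print(generate_code(optimize(ir, target), target))
```

Under these assumptions, the same intermediate representation yields different schedules on targets with different cache sizes, which is what makes the generated code hardware-specific.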