Publication Number: US11436524B2
Publication Date: 2022-09-06
Application Number: US16146331
Filing Date: 2018-09-28
Applicant: Amazon Technologies, Inc.
Inventor: Nikhil Kandoi, Ganesh Kumar Gella, Rama Krishna Sandeep Pokkunuri, Sudhakar Rao Puvvadi, Stefano Stefani, Kalpesh N. Sutaria, Enrico Sartorello, Tania Khattar
IPC: G06N20/00, G06N5/04, G06F9/50, H04L67/1001
Abstract: Techniques for hosting machine learning models are described. In some instances, a method is performed that includes: receiving a request to perform an inference using a particular machine learning model; determining a group of hosts to route the request to, the group of hosts to host a plurality of machine learning models including the particular machine learning model; determining a path to the determined group of hosts; determining, based on the determined path, a particular host of the group of hosts to perform an analysis of the request, the particular host having the particular machine learning model in memory; routing the request to the particular host; performing inference on the request using the particular host; and providing a result of the inference to a requester.
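The sequence of steps in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the names `Host`, `HostGroup`, and `route_inference` are hypothetical, the "path" is modeled simply as the ordered list of hosts in the chosen group, and a model is represented as a plain callable. The abstract does not say how the path is computed or what happens when no host has the model in memory, so the sketch just raises an error in that case.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Set, Tuple

# Hypothetical stand-in for a loaded model: takes a payload, returns a result.
Model = Callable[[str], str]


@dataclass
class Host:
    name: str
    # Models this host currently holds in memory (assumed representation).
    loaded: Dict[str, Model] = field(default_factory=dict)


@dataclass
class HostGroup:
    name: str
    model_names: Set[str]  # models this group is responsible for hosting
    hosts: List[Host]


def route_inference(
    model_name: str, payload: str, groups: List[HostGroup]
) -> Tuple[str, str]:
    """Route one inference request; return (host name, inference result)."""
    # Determine the group of hosts that hosts the requested model.
    group = next((g for g in groups if model_name in g.model_names), None)
    if group is None:
        raise KeyError(f"no host group serves model {model_name!r}")

    # Determine a path to that group. Assumption: the path is the ordered
    # list of candidate hosts in the group.
    path = list(group.hosts)

    # Determine the particular host on the path with the model in memory.
    host = next((h for h in path if model_name in h.loaded), None)
    if host is None:
        raise LookupError(f"model {model_name!r} is not in memory on any host")

    # Route the request to that host, perform inference, return the result.
    result = host.loaded[model_name](payload)
    return host.name, result
```

For instance, with a group whose second host holds a model named `sentiment` in memory, `route_inference("sentiment", "hi", groups)` selects that second host and returns its name together with the model's output.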