-
Publication No.: US20240333658A1
Publication date: 2024-10-03
Application No.: US18642717
Filing date: 2024-04-22
Applicant: Amazon Technologies, Inc.
Inventor: Satya Naga Satis Kumar Gunuputi Alluri Venka , John Baker , Shahab Shekari , Kartik Natarajan , Ruhaab Markas , Ganesh Kumar Gella , Santosh Kumar Ameti
IPC: H04L47/762
CPC classification number: H04L47/762
Abstract: Based on analysis of a workload associated with a throttling key of a client request directed to a first service, a scale-out requirement of the throttling key is obtained at respective resource managers of a plurality of other services which are utilized by the first service to respond to client requests. The resource managers initiate, asynchronously with respect to one another, resource provisioning tasks at each of the other services to fulfill the scale-out requirement. A throttling limit associated with the throttling key is updated to a second throttling limit after the resource provisioning tasks are completed by the resource managers, and the updated limit is used to determine whether to accept another client request associated with the throttling key.
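Illustrative sketch (not part of the patent record): the flow in the abstract above can be read as "provision at every dependent service independently, and raise the per-key limit only once all of them have finished." The Python below is a minimal sketch under that reading. The names ThrottlingTable, provision, and scale_out, the service names, and the in-flight counting scheme are all hypothetical; the abstract does not specify any of them.

```python
import asyncio


class ThrottlingTable:
    """Per-key admission limits (hypothetical structure, not from the patent)."""

    def __init__(self, limits):
        self.limits = dict(limits)   # throttling key -> max accepted requests
        self.in_flight = {}          # throttling key -> accepted so far

    def accept(self, key):
        """Accept a request only while the key is under its current limit."""
        used = self.in_flight.get(key, 0)
        if used >= self.limits.get(key, 0):
            return False
        self.in_flight[key] = used + 1
        return True


async def provision(service, key, extra_capacity):
    """Stand-in for one resource manager's provisioning task (assumption)."""
    await asyncio.sleep(0)  # real provisioning would take time here
    return service, extra_capacity


async def scale_out(table, key, new_limit, services):
    """Run provisioning at each service concurrently, then raise the limit."""
    extra = new_limit - table.limits[key]
    results = await asyncio.gather(*(provision(s, key, extra) for s in services))
    # The limit changes only after every provisioning task has completed.
    table.limits[key] = new_limit
    return results


def demo():
    table = ThrottlingTable({"customer-A": 2})
    before = [table.accept("customer-A") for _ in range(3)]
    asyncio.run(scale_out(table, "customer-A", 4, ["storage", "index", "auth"]))
    after = [table.accept("customer-A") for _ in range(3)]
    return before, after
```

In demo(), the third request is rejected under the original limit of 2; after scale-out to 4, two more requests are accepted before the key is throttled again.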
-
Publication No.: US11997021B1
Publication date: 2024-05-28
Application No.: US18193502
Filing date: 2023-03-30
Applicant: Amazon Technologies, Inc.
Inventor: Satya Naga Satis Kumar Gunuputi Alluri Venka , John Baker , Shahab Shekari , Kartik Natarajan , Ruhaab Markas , Ganesh Kumar Gella , Santosh Kumar Ameti
IPC: H04L47/762
CPC classification number: H04L47/762
Abstract: Based on analysis of a workload associated with a throttling key of a client request directed to a first service, a scale-out requirement of the throttling key is obtained at respective resource managers of a plurality of other services which are utilized by the first service to respond to client requests. The resource managers initiate, asynchronously with respect to one another, resource provisioning tasks at each of the other services to fulfill the scale-out requirement. A throttling limit associated with the throttling key is updated to a second throttling limit after the resource provisioning tasks are completed by the resource managers, and the updated limit is used to determine whether to accept another client request associated with the throttling key.
-
Publication No.: US11436524B2
Publication date: 2022-09-06
Application No.: US16146331
Filing date: 2018-09-28
Applicant: Amazon Technologies, Inc.
Inventor: Nikhil Kandoi , Ganesh Kumar Gella , Rama Krishna Sandeep Pokkunuri , Sudhakar Rao Puvvadi , Stefano Stefani , Kalpesh N. Sutaria , Enrico Sartorello , Tania Khattar
IPC: G06N20/00 , G06N5/04 , G06F9/50 , H04L67/1001
Abstract: Techniques for hosting machine learning models are described. In some instances, a method is performed that includes: receiving a request to perform an inference using a particular machine learning model; determining a group of hosts to route the request to, the group of hosts to host a plurality of machine learning models including the particular machine learning model; determining a path to the determined group of hosts; determining a particular host of the group of hosts to perform an analysis of the request based on the determined path, the particular host having the particular machine learning model in memory; routing the request to the particular host of the group of hosts; performing inference on the request using the particular host; and providing a result of the inference to a requester.
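Illustrative sketch (not part of the patent record): the steps listed in the abstract above (pick a host group, find its path, pick a host that already holds the model in memory, run inference, return the result) can be sketched as follows. Host, HostGroup, route_inference, the path strings, and the toy "models" are hypothetical; the path step in particular is reduced here to reading a stored string, which is only an assumption about what "determining a path" might involve.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Set, Tuple


@dataclass
class Host:
    name: str
    # Models currently loaded in this host's memory: model id -> callable.
    in_memory: Dict[str, Callable[[float], float]] = field(default_factory=dict)


@dataclass
class HostGroup:
    path: str                 # hypothetical routing path to the group
    hosted_models: Set[str]   # every model the group is responsible for
    hosts: List[Host]


def route_inference(
    model_id: str, payload: float, groups: List[HostGroup]
) -> Tuple[str, str, float]:
    """Return (path, host name, result) for one inference request.

    Raises StopIteration if no group hosts the model or no host in the
    group has it in memory; a real system would load the model instead.
    """
    # 1. Determine the group of hosts responsible for the model.
    group = next(g for g in groups if model_id in g.hosted_models)
    # 2. Determine the path to that group (a stored string in this sketch).
    path = group.path
    # 3. Determine a host in the group that has the model in memory.
    host = next(h for h in group.hosts if model_id in h.in_memory)
    # 4. Route the request to that host and perform inference.
    result = host.in_memory[model_id](payload)
    # 5. Provide the result to the requester.
    return path, host.name, result


groups = [
    HostGroup(
        path="/groups/g1",
        hosted_models={"double", "square"},
        hosts=[
            Host("host-a", {"square": lambda x: x * x}),
            Host("host-b", {"double": lambda x: 2 * x}),
        ],
    )
]
```

With the sample data, route_inference("double", 3.0, groups) is served by host-b, the only host holding "double" in memory.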
-
Publication No.: US20210005199A1
Publication date: 2021-01-07
Application No.: US16943327
Filing date: 2020-07-30
Applicant: Amazon Technologies, Inc.
Inventor: Ganesh Kumar Gella , Venkata Abhinav Sidharth Bhagavatula , Robert William Serr , Yonnas Getahun Beyene
Abstract: Methods and systems for adding functionality to an account of a language processing system, where the functionality is associated with a second account of a first application system, are described herein. In a non-limiting embodiment, an individual may log into a first account of a language processing system and log into a second account of a first application system. While logged into both the first account and the second account, a button included within a webpage provided by the first application may be invoked. A request capable of being serviced using the first functionality may be received by the language processing system from a device associated with the first account. The language processing system may send first account data and the second account data to the first application system to facilitate an action associated with the request, thereby enabling the first functionality for the first account.
-
Publication No.: US10089983B1
Publication date: 2018-10-02
Application No.: US15617604
Filing date: 2017-06-08
Applicant: Amazon Technologies, Inc.
Inventor: Ganesh Kumar Gella , Venkata Abhinav Sidharth Bhagavatula , Robert William Serr , Yonnas Getahun Beyene
Abstract: Methods and systems for adding functionality to an account of a language processing system, where the functionality is associated with a second account of a first application system, are described herein. In a non-limiting embodiment, an individual may log into a first account of a language processing system and log into a second account of a first application system. While logged into both the first account and the second account, a button included within a webpage provided by the first application may be invoked. A request capable of being serviced using the first functionality may be received by the language processing system from a device associated with the first account. The language processing system may send first account data and the second account data to the first application system to facilitate an action associated with the request, thereby enabling the first functionality for the first account.
-