Patent search ap:("Amazon Technologies, Inc.") AND inv:"Sudipta Sengupta"

21.

Invention publication
RANDOM TOKEN SEGMENTATION FOR TRAINING NEXT TOKEN PREDICTION MODELS (Under examination, published)

Publication (announcement) number: US20230419036A1

Publication (announcement) date: 2023-12-28

Application number: US17847118

Application date: 2022-06-22

Applicant: Amazon Technologies, Inc.

Inventor: Zijian Wang, Yuchen Tian, Mingyue Shang, Praphruetpong Athiwaratkun, Ming Tan, Parminder Bhatia, Andrew Oliver Arnold, Ramesh M Nallapati, Sudipta Sengupta, Bing Xiang, Atul Deo, Ankur Deepak Desai

IPC: G06F40/284, G06N20/00, G06F8/41, G06F8/30

CPC classification number: G06F40/284, G06N20/00, G06F8/427, G06F8/30

Abstract: Random token segmentation may be implemented for next token prediction. Text data may be received for training a machine learning model to predict a next token given input text tokens. Multiple tokens may be determined from the text data. Different ones of the multiple tokens may be randomly segmented into sub-tokens. The machine learning model may then be trained using the multiple tokens, including the respective sub-tokens, as a training data set.
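
The segmentation step described in this abstract can be illustrated with a minimal sketch. It is not the patented method: whitespace tokenization, the `split_prob` parameter, the single random cut point, and the function names are all assumptions made for illustration.

```python
import random


def random_segment(token: str, rng: random.Random) -> list[str]:
    """Split one token into two sub-tokens at a random interior position."""
    if len(token) < 2:
        return [token]  # too short to split
    cut = rng.randint(1, len(token) - 1)  # inclusive bounds keep both parts non-empty
    return [token[:cut], token[cut:]]


def build_training_tokens(text: str, split_prob: float = 0.3, seed: int = 0) -> list[str]:
    """Tokenize on whitespace (an assumption) and randomly segment some tokens."""
    rng = random.Random(seed)
    out: list[str] = []
    for token in text.split():
        if rng.random() < split_prob:
            out.extend(random_segment(token, rng))  # sub-tokens join the training set
        else:
            out.append(token)
    return out


if __name__ == "__main__":
    print(build_training_tokens("def compute_total(items): return sum(items)"))
```

Seeding the generator makes the segmentation reproducible, which is convenient when comparing training runs.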

22.

Granted invention patent
Machine learning inference calls for database query processing (In force)

Publication (announcement) number: US11775868B1

Publication (announcement) date: 2023-10-03

Application number: US17884955

Application date: 2022-08-10

Applicant: Amazon Technologies, Inc.

Inventor: Sangil Song, Yongsik Yoon, Kamal Kant Gupta, Saileshwar Krishnamurthy, Stefano Stefani, Sudipta Sengupta, Jaeyun Noh

IPC: G06F7/00, G06N20/00, G06F16/242, G06F16/2453, G06N5/04

CPC classification number: G06N20/00, G06F16/2433, G06F16/24542, G06N5/04

Abstract: Techniques for making machine learning inference calls for database query processing are described. In some embodiments, a method of making machine learning inference calls for database query processing may include generating a first batch of machine learning requests based at least on a query to be performed on data stored in a database service, wherein the query identifies a machine learning service, sending the first batch of machine learning requests to an input buffer of an asynchronous request handler, the asynchronous request handler to generate a second batch of machine learning requests based on the first batch of machine learning requests, and obtaining a plurality of machine learning responses from an output buffer of the asynchronous request handler, the machine learning responses generated by the machine learning service using a machine learning model in response to receiving the second batch of machine learning requests.
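
The buffer-and-rebatch flow in this abstract can be sketched roughly as follows. This is only an illustration: the in-process queues, the single worker thread, the batch sizes, and the `model` callable standing in for the machine learning service are assumptions, not details taken from the patent.

```python
import queue
import threading
from typing import Callable

SENTINEL = object()  # marks the end of a stream in either buffer


def request_handler(inbuf: queue.Queue, outbuf: queue.Queue,
                    model: Callable[[list], list], batch_size: int) -> None:
    """Regroup first-stage batches into second-stage batches and call the model."""
    pending: list = []
    while True:
        item = inbuf.get()
        if item is SENTINEL:
            break
        pending.extend(item)  # item is one first-stage batch of requests
        while len(pending) >= batch_size:
            batch, pending = pending[:batch_size], pending[batch_size:]
            for response in model(batch):
                outbuf.put(response)
    if pending:  # flush the final, possibly smaller, batch
        for response in model(pending):
            outbuf.put(response)
    outbuf.put(SENTINEL)


def run_query(rows: list, model: Callable[[list], list],
              first_batch_size: int = 2, second_batch_size: int = 3) -> list:
    """Send first-stage batches to the input buffer and read the output buffer."""
    inbuf: queue.Queue = queue.Queue()
    outbuf: queue.Queue = queue.Queue()
    worker = threading.Thread(target=request_handler,
                              args=(inbuf, outbuf, model, second_batch_size))
    worker.start()
    for i in range(0, len(rows), first_batch_size):
        inbuf.put(rows[i:i + first_batch_size])
    inbuf.put(SENTINEL)
    responses = []
    while (response := outbuf.get()) is not SENTINEL:
        responses.append(response)
    worker.join()
    return responses


def fake_model(batch: list) -> list:
    """Hypothetical stand-in for the machine learning service."""
    return [len(text) for text in batch]
```

With a single handler thread, responses come back in request order; a real service would need explicit correlation between requests and responses.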

23.

Granted invention patent
Providing query restatements for explaining natural language query results (In force)

Publication (announcement) number: US11726994B1

Publication (announcement) date: 2023-08-15

Application number: US17219694

Application date: 2021-03-31

Applicant: Amazon Technologies, Inc.

Inventor: Jun Wang, Zhiguo Wang, Sharanabasappa Parashuram Revadigar, Ramesh M Nallapati, Bing Xiang, Sudipta Sengupta, Yung Haw Wang

IPC: G06F16/242, G06F16/2452, G06F16/28, G06F16/248, G06F16/2457

CPC classification number: G06F16/243, G06F16/248, G06F16/24522, G06F16/24573, G06F16/287

Abstract: Query restatements may be provided for explaining natural language query results. A natural language query is received at a natural language query processing system. An intermediate representation of the natural language query is generated for executing the natural language query. The intermediate representation is translated into a natural language restatement of the natural language query. The natural language restatement is provided with a result of the natural language query via an interface of the natural language query processing system.
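
As a rough illustration of translating an intermediate representation into a restatement, the sketch below uses a template. The `QueryIR` fields and the wording of the template are hypothetical; the abstract does not specify the form of the intermediate representation.

```python
from dataclasses import dataclass, field


@dataclass
class QueryIR:
    """Hypothetical intermediate representation of a natural language query."""
    metric: str
    aggregation: str
    filters: dict[str, str] = field(default_factory=dict)


def restate(ir: QueryIR) -> str:
    """Turn the intermediate representation back into a plain-English restatement."""
    text = f"Showing the {ir.aggregation} of {ir.metric}"
    if ir.filters:
        conditions = " and ".join(f"{key} is {value}" for key, value in ir.filters.items())
        text += f" where {conditions}"
    return text + "."


if __name__ == "__main__":
    print(restate(QueryIR("sales", "sum", {"region": "EMEA", "year": "2020"})))
```

Showing the restatement next to the result lets a user check that the system interpreted the question as intended.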

24.

Granted invention patent
Multiple stage filtering for natural language query processing pipelines (In force)

Publication (announcement) number: US11500865B1

Publication (announcement) date: 2022-11-15

Application number: US17219706

Application date: 2021-03-31

Applicant: Amazon Technologies, Inc.

Inventor: Jun Wang, Zhiguo Wang, Sharanabasappa Parashuram Revadigar, Ramesh M Nallapati, Bing Xiang, Stephen Michael Ash, Timothy Jones, Sudipta Sengupta, Rishav Chakravarti, Patrick Ng, Jiarong Jiang, Hanbo Li, Donald Harold Rivers Weidner

IPC: G06F7/00, G06F16/2452, G06F40/295, G06N20/00, G06F16/242

Abstract: Multiple stage filtering may be implemented for natural language query processing pipelines. Natural language queries may be received at a natural language query processing system and processed through a query language processing pipeline. The query language processing pipeline may filter candidate linkages for a natural language query before performing further filtering of the candidate linkages in the natural language query processing pipeline as part of generating an intermediate representation used to execute the natural language query.
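
A two-stage filter over candidate linkages might look like the sketch below. The specific filters are assumptions chosen for illustration (a cheap word-overlap test, then a similarity ranking with `difflib`); the abstract does not say which filters the pipeline uses.

```python
from difflib import SequenceMatcher


def _normalize(name: str) -> str:
    """Lowercase a field name and treat underscores as word separators."""
    return name.lower().replace("_", " ")


def coarse_filter(mention: str, fields: list[str]) -> list[str]:
    """Stage one: keep fields that share at least one word with the mention."""
    words = set(mention.lower().split())
    return [f for f in fields if words & set(_normalize(f).split())]


def fine_filter(mention: str, candidates: list[str], top_k: int = 1) -> list[str]:
    """Stage two: rank the surviving candidates by string similarity."""
    ranked = sorted(
        candidates,
        key=lambda f: SequenceMatcher(None, mention.lower(), _normalize(f)).ratio(),
        reverse=True,
    )
    return ranked[:top_k]


def link(mention: str, fields: list[str], top_k: int = 1) -> list[str]:
    """Run both stages to pick the linkage(s) for one query mention."""
    return fine_filter(mention, coarse_filter(mention, fields), top_k)
```

Running the cheap filter first keeps the more expensive comparison limited to a short candidate list.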

25.

Invention application
NEURAL NETWORK TRAINING UNDER MEMORY RESTRAINT (In force)

Publication (announcement) number: US20210304010A1

Publication (announcement) date: 2021-09-30

Application number: US16836421

Application date: 2020-03-31

Applicant: Amazon Technologies, Inc.

Inventor: Sudipta Sengupta, Randy Renfu Huang, Ron Diamant, Vignesh Vivekraja

IPC: G06N3/08, G06N3/04

Abstract: Methods and systems for training a neural network are provided. In one example, an apparatus comprises a memory that stores instructions; and a hardware processor configured to execute the instructions to: control a neural network processor to perform a loss gradient operation to generate data gradients; after the loss gradient operation completes, control the neural network processor to perform a forward propagation operation to generate intermediate outputs; control the neural network processor to perform a backward propagation operation based on the data gradients and the intermediate outputs to generate weight gradients; receive the weight gradients from the neural network processor; and update weights of a neural network based on the weight gradients.
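
The ordering in this abstract (loss gradient first, then a forward propagation to regenerate intermediate outputs, then backward propagation) can be sketched on a toy scalar network. The network shape, the squared-error loss, the learning rate, and the initial forward pass that discards intermediates are assumptions for illustration; the abstract does not describe them.

```python
def forward(x: float, w1: float, w2: float, keep_intermediates: bool):
    """Two-layer scalar network: h = relu(w1 * x), y = w2 * h."""
    h = max(0.0, w1 * x)  # hidden activation, i.e. the intermediate output
    y = w2 * h
    return (y, h) if keep_intermediates else (y, None)


def train_step(x: float, target: float, w1: float, w2: float, lr: float = 0.1):
    """One update that recomputes intermediates instead of storing them."""
    # Assumed initial forward pass: only the final output is kept, to save memory.
    y, _ = forward(x, w1, w2, keep_intermediates=False)
    # Loss gradient operation: data gradient dL/dy for L = 0.5 * (y - target) ** 2.
    dy = y - target
    # Forward propagation repeated after the loss gradient to regenerate h.
    _, h = forward(x, w1, w2, keep_intermediates=True)
    # Backward propagation: weight gradients from data gradients and intermediates.
    dw2 = dy * h
    dh = dy * w2
    dw1 = dh * x if h > 0 else 0.0  # ReLU passes gradient only where it was active
    # Weight update based on the weight gradients.
    return w1 - lr * dw1, w2 - lr * dw2
```

The trade is extra computation (a second forward pass) for lower peak memory, since the intermediate outputs need not be held while the loss gradient is computed.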

26.

Invention application
NEURAL NETWORK TRAINING UNDER MEMORY RESTRAINT (In force)

Publication (announcement) number: US20240403646A1

Publication (announcement) date: 2024-12-05

Application number: US18798323

Application date: 2024-08-08

Applicant: Amazon Technologies, Inc.

Inventor: Sudipta Sengupta, Randy Renfu Huang, Ron Diamant, Vignesh Vivekraja

IPC: G06N3/084, G06N3/04

Abstract: Methods and systems for training a neural network are provided. In one example, an apparatus comprises a memory that stores instructions; and a hardware processor configured to execute the instructions to: control a neural network processor to perform a loss gradient operation to generate data gradients; after the loss gradient operation completes, control the neural network processor to perform a forward propagation operation to generate intermediate outputs; control the neural network processor to perform a backward propagation operation based on the data gradients and the intermediate outputs to generate weight gradients; receive the weight gradients from the neural network processor; and update weights of a neural network based on the weight gradients.

27.

Invention publication
CONSTRAINED PREFIX MATCHING FOR GENERATING NEXT TOKEN PREDICTIONS (Under examination, published)

Publication (announcement) number: US20230418567A1

Publication (announcement) date: 2023-12-28

Application number: US17847115

Application date: 2022-06-22

Applicant: Amazon Technologies, Inc.

Inventor: Praphruetpong Athiwaratkun, Yuchen Tian, Mingyue Shang, Zijian Wang, Ramesh M. Nallapati, Parminder Bhatia, Andrew Oliver Arnold, Bing Xiang, Sudipta Sengupta, Yanitsa Donchev, Srinivas Iragavarapu, Matthew Lee, Vamshidhar Krishnamurthy Dantu, Atul Deo, Ankur Deepak Desai

IPC: G06F8/33

CPC classification number: G06F8/33

Abstract: Prefix matching may constrain the generation of next token predictions. Input text to perform a next token prediction may be received. Multiple tokens may be determined from the input text, including a partial token. From possible tokens, one or more possible tokens matching the partial token may be identified. Next token predictions may then be filtered using the identified possible tokens in order to ensure that the partial token is matched.
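
The filtering step described here can be sketched as follows. Whitespace tokenization, the `scores` dictionary standing in for model output, and the function names are assumptions made for illustration, not details from the patent.

```python
from typing import Optional


def split_partial(input_text: str) -> tuple[list[str], str]:
    """Whitespace tokenization (an assumption); the last piece is the partial token."""
    tokens = input_text.split()
    if not tokens or input_text[-1].isspace():
        return tokens, ""  # nothing is partially typed
    return tokens[:-1], tokens[-1]


def constrained_prediction(input_text: str, scores: dict[str, float]) -> Optional[str]:
    """Return the best-scoring candidate token that starts with the partial token."""
    _, partial = split_partial(input_text)
    # Keep only the possible tokens that match the partial token as a prefix.
    matching = {tok: s for tok, s in scores.items() if tok.startswith(partial)}
    if not matching:
        return None
    return max(matching, key=matching.get)


if __name__ == "__main__":
    scores = {"numpy": 0.6, "os": 0.9, "numbers": 0.3}  # hypothetical model scores
    print(constrained_prediction("import nu", scores))
```

Without the filter, the highest-scoring token (`os` in the hypothetical scores) would be proposed even though it contradicts what the user has already typed.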

28.

Granted invention patent
Attached accelerator based inference service (In force)

Publication (announcement) number: US11599821B2

Publication (announcement) date: 2023-03-07

Application number: US16020776

Application date: 2018-06-27

Applicant: Amazon Technologies, Inc.

Inventor: Sudipta Sengupta, Poorna Chand Srinivas Perumalla, Dominic Rajeev Divakaruni, Nafea Bshara, Leo Parker Dirac, Bratin Saha, Matthew James Wood, Andrea Olgiati, Swaminathan Sivasubramanian

IPC: G06N20/00, G06F9/50, G06N5/04, G06F9/455, G06N3/04, G06N3/063

Abstract: Implementations detailed herein include description of a computer-implemented method. In an implementation, the method at least includes receiving an application instance configuration, an application of the application instance to utilize a portion of an attached accelerator during execution of a machine learning model and the application instance configuration including: an indication of the central processing unit (CPU) capability to be used, an arithmetic precision of the machine learning model to be used, an indication of the accelerator capability to be used, a storage location of the application, and an indication of an amount of random access memory to use.
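
The configuration fields listed in this abstract could be modeled as a small record type. The field names, value formats, and the validation rule below are hypothetical; the abstract names the kinds of information but not their representation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ApplicationInstanceConfig:
    """Hypothetical container for the configuration items named in the abstract."""
    cpu_capability: str                # indication of the CPU capability to use
    arithmetic_precision: str          # precision of the machine learning model
    accelerator_capability: str        # indication of the accelerator capability
    application_storage_location: str  # where the application is stored
    memory_mib: int                    # amount of random access memory to use

    def validate(self) -> None:
        """Reject configurations that cannot be provisioned."""
        if self.memory_mib <= 0:
            raise ValueError("memory_mib must be positive")


if __name__ == "__main__":
    config = ApplicationInstanceConfig(
        cpu_capability="4 vCPU",
        arithmetic_precision="FP16",
        accelerator_capability="8 TOPS",
        application_storage_location="storage://example-bucket/app",
        memory_mib=8192,
    )
    config.validate()
    print(config)
```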

29.

Granted invention patent
Attached accelerator selection and placement (In force)

Publication (announcement) number: US11494621B2

Publication (announcement) date: 2022-11-08

Application number: US16020788

Application date: 2018-06-27

Applicant: Amazon Technologies, Inc.

Inventor: Sudipta Sengupta, Poorna Chand Srinivas Perumalla, Dominic Rajeev Divakaruni, Nafea Bshara, Leo Parker Dirac, Bratin Saha, Matthew James Wood, Andrea Olgiati, Swaminathan Sivasubramanian

IPC: G06N3/063, G06N3/08

Abstract: Implementations detailed herein include description of a computer-implemented method. In an implementation, the method at least includes receiving an application instance configuration, an application of the application instance to utilize a portion of an attached accelerator during execution of a machine learning model and the application instance configuration including an arithmetic precision of the machine learning model to be used in determining the portion of the accelerator to provision; provisioning the application instance and the portion of the accelerator attached to the application instance, wherein the application instance is implemented using a physical compute instance in a first location, wherein the portion of the accelerator is implemented using a physical accelerator in the second location; loading the machine learning model onto the portion of the accelerator; and performing inference using the loaded machine learning model of the application using the portion of the accelerator on the attached accelerator.

30.

Granted invention patent
Attached accelerator scaling (In force)

Publication (announcement) number: US11422863B2

Publication (announcement) date: 2022-08-23

Application number: US16020810

Application date: 2018-06-27

Applicant: Amazon Technologies, Inc.

Inventor: Sudipta Sengupta, Poorna Chand Srinivas Perumalla, Dominic Rajeev Divakaruni, Nafea Bshara, Leo Parker Dirac, Bratin Saha, Matthew James Wood, Andrea Olgiati, Swaminathan Sivasubramanian

IPC: G06F9/50, G06N5/04, G06F9/455, G06F9/38, G06N20/00

Abstract: Implementations detailed herein include description of a computer-implemented method. In an implementation, the method at least includes provisioning an application instance and portions of at least one accelerator attached to the application instance to execute a machine learning model of an application of the application instance; loading the machine learning model onto the portions of the at least one accelerator; receiving scoring data in the application; and utilizing each of the portions of the attached at least one accelerator to perform inference on the scoring data in parallel and only using one response from the portions of the accelerator.
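
Running inference on several accelerator portions in parallel and using only one response could be sketched as below. Threads stand in for accelerator portions, each portion is modeled as a hypothetical callable, and taking the first response to complete is an assumption; the abstract only says that one response is used.

```python
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait
from typing import Callable, Sequence


def infer_on_portions(portions: Sequence[Callable[[list], float]],
                      scoring_data: list) -> float:
    """Submit the same scoring data to every portion and use a single response."""
    with ThreadPoolExecutor(max_workers=len(portions)) as pool:
        futures = [pool.submit(portion, scoring_data) for portion in portions]
        # Assumption: the response used is whichever finishes first.
        done, _ = wait(futures, return_when=FIRST_COMPLETED)
        return next(iter(done)).result()


def loaded_model(scoring_data: list) -> float:
    """Hypothetical model loaded onto each accelerator portion."""
    return float(sum(scoring_data))


if __name__ == "__main__":
    print(infer_on_portions([loaded_model, loaded_model], [1, 2, 3]))
```

Leaving the `with` block still waits for the remaining workers to finish; their responses are simply discarded.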
