-
Publication number: US12033048B1
Publication date: 2024-07-09
Application number: US17107820
Filing date: 2020-11-30
Applicant: Amazon Technologies, Inc.
Inventor: Laurent Callot , Jasmeet Chhabra , Lifan Chen , Ming Chen , Tim Januschowski , Andrey Kan , Luyang Kong , Baris Kurt , Pramuditha Perera , Mostafa Rahmani , Parminder Bhatia
IPC: H04L29/06 , G06F18/214 , G06N20/20
CPC classification number: G06N20/20 , G06F18/214
Abstract: Techniques for performing anomaly detection are described. An exemplary method includes receiving a request to detect potential anomalies using an anomaly detection system having at least one anomaly scoring model; processing the received data using the anomaly detection system to score the data to determine when the data is potentially anomalous based on one or more thresholds; requesting feedback on at least one determined potential anomaly; receiving feedback on the at least one determined potential anomaly; and adjusting at least one of one or more of the thresholds used to determine potential anomalies and what is considered an anomaly without adjusting the at least one anomaly scoring model.
-
Publication number: US20230419036A1
Publication date: 2023-12-28
Application number: US17847118
Filing date: 2022-06-22
Applicant: Amazon Technologies, Inc.
Inventor: Zijian Wang , Yuchen Tian , Mingyue Shang , Praphruetpong Athiwaratkun , Ming Tan , Parminder Bhatia , Andrew Oliver Arnold , Ramesh M Nallapati , Sudipta Sengupta , Bing Xiang , Atul Deo , Ankur Deepak Desai
IPC: G06F40/284 , G06N20/00 , G06F8/41 , G06F8/30
CPC classification number: G06F40/284 , G06N20/00 , G06F8/427 , G06F8/30
Abstract: Random token segmentation may be implemented for next token prediction. Text data may be received for training a machine learning model to predict a next token given input text tokens. Multiple tokens may be determined from the text data. Different ones of the multiple tokens may be randomly segmented into sub-tokens. The machine learning model may then be trained using the multiple tokens including the respective sub-tokens as a training data set.
-
Publication number: US12242525B1
Publication date: 2025-03-04
Application number: US18079803
Filing date: 2022-12-12
Applicant: Amazon Technologies, Inc.
Inventor: Parminder Bhatia , Thiruvarul Selvan Senthivel , Emine Busra Celikkaya , Jeremy Douglas Fehr , Arjun Mukhopadhyay , Shyam Ramaswamy , Arun Kumar Ravi
IPC: G06F16/36 , G06F16/33 , G06F16/334 , G06N20/00 , G16H50/20
Abstract: Techniques for ontology linking of unstructured text as a service are described. A service may receive a request to link unstructured text to a standardized ontology, and the service may segment and tokenize the unstructured text and send the result to multiple services implementing multiple deep machine learning models trained to identify particular entities and one or more relationships between entities. The service may perform a search of the standardized ontology to identify a set of similar candidates from the standardized ontology for the detected entities and the one or more relationships, and then rank the set of similar candidates from the standardized ontology according to their similarity to the detected entities within the unstructured text. The output from the service may include a result identifying a highest ranked candidate of the set of similar candidates from the standardized ontology for the detected entities within the unstructured text.
-
Publication number: US12019983B1
Publication date: 2024-06-25
Application number: US17360890
Filing date: 2021-06-28
Applicant: Amazon Technologies, Inc.
Inventor: Luyang Kong , Christoper Winestock , Parminder Bhatia , Meng Xiao Wang , Siddhi Pathak
IPC: G06F40/247 , G06F16/36 , G06F16/901 , G06N5/02 , G06N5/022
CPC classification number: G06F40/247 , G06F16/367 , G06F16/9024 , G06N5/02 , G06N5/022
Abstract: Techniques for generating a dataset from a knowledge graph are described. An exemplary method includes receiving a request to generate a dataset from a knowledge graph to be stored in the storage; generating a dataset comprising a plurality of mention-concept pairs from the knowledge graph according to the request based on one or more of a synonym-based and graph-based evaluation of the knowledge graph and a custom ontology for the knowledge graph; and storing the generated dataset in the storage.
-
Publication number: US11556579B1
Publication date: 2023-01-17
Application number: US16714243
Filing date: 2019-12-13
Applicant: Amazon Technologies, Inc.
Inventor: Parminder Bhatia , Thiruvarul Selvan Senthivel , Emine Busra Celikkaya , Jeremy Douglas Fehr , Arjun Mukhopadhyay , Shyam Ramaswamy , Arun Kumar Ravi
Abstract: Techniques for ontology linking of unstructured text as a service are described. A service may receive a request to link unstructured text to a standardized ontology, and the service may segment and tokenize the unstructured text and send the result to multiple services implementing multiple deep machine learning models trained to identify particular entities and one or more relationships between entities. The service may perform a search of the standardized ontology to identify a set of similar candidates from the standardized ontology for the detected entities and the one or more relationships, and then rank the set of similar candidates from the standardized ontology according to their similarity to the detected entities within the unstructured text. The output from the service may include a result identifying a highest ranked candidate of the set of similar candidates from the standardized ontology for the detected entities within the unstructured text.
-
Publication number: US11093714B1
Publication date: 2021-08-17
Application number: US16293459
Filing date: 2019-03-05
Applicant: Amazon Technologies, Inc.
Inventor: Parminder Bhatia
IPC: G06F40/295 , G06N3/08 , G06N3/04
Abstract: The present disclosure is directed to optimizing transfer learning for neural networks by creating a dynamic transfer network configuration through gated architecture. In some embodiments, transfer learning implements multiple parameter sharing schemes across a source task and a target task. The gating architecture can learn the optimal parameter sharing schemes as the neural network is trained. In some embodiments, the system can be used in named entity recognition applications where the training data is limited.
-
Publication number: US20230418567A1
Publication date: 2023-12-28
Application number: US17847115
Filing date: 2022-06-22
Applicant: Amazon Technologies, Inc.
Inventor: Praphruetpong Athiwaratkun , Yuchen Tian , Mingyue Shang , Zijian Wang , Ramesh M. Nallapati , Parminder Bhatia , Andrew Oliver Arnold , Bing Xiang , Sudipta Sengupta , Yanitsa Donchev , Srinivas Iragavarapu , Matthew Lee , Vamshidhar Krishnamurthy Dantu , Atul Deo , Ankur Deepak Desai
IPC: G06F8/33
CPC classification number: G06F8/33
Abstract: Prefix matching may constrain the generation of next token predictions. Input text to perform a next token prediction may be received. Multiple tokens may be determined from the input text, including a partial token. From possible tokens, one or more possible tokens matching the partial token may be identified. Next token predictions may then be filtered using the identified possible tokens in order to ensure that the partial token is matched.
-
Publication number: US11487942B1
Publication date: 2022-11-01
Application number: US16437338
Filing date: 2019-06-11
Applicant: Amazon Technologies, Inc.
Inventor: Thiruvarul Selvan Senthivel , Varun Sembium Varadarajan , Borui Zhang , Tiberiu Mircea Doman , Parminder Bhatia , Arun Kumar Ravi , Mohammed Khalilia , Emine Busra Celikkaya
IPC: G06F16/93 , G06F40/30 , G06F40/295 , G06F16/28 , G06F16/31 , G06N3/04 , G06N3/08 , G06F40/284
Abstract: Techniques for entity and relationship detection from unstructured text as a service are described. A service may receive a request to identify entities within a provided unstructured text element, and the service may segment and tokenize the unstructured text and send the result to multiple services implementing multiple deep machine learning models trained to identify particular entities. The service may send additional requests to an additional service or services implementing additional deep machine learning models to identify relationships between detected attributes and ones of the detected entities. The outputs from all services can be analyzed and consolidated into a single result that identifies the entities, any attributes of the entities, and confidence scores indicating the confidence in each detected entity.
-
Publication number: US12141553B2
Publication date: 2024-11-12
Application number: US17847113
Filing date: 2022-06-22
Applicant: Amazon Technologies, Inc.
Inventor: Praphruetpong Athiwaratkun , Zixuan Lin , Ramana Keerthi , Zijian Wang , Yuchen Tian , Hantian Ding , Sri Ranga Akhilesh Bontala , Matthew Lee , Yanitsa Donchev , Ramesh M Nallapati , Parminder Bhatia , Andrew Oliver Arnold , Bing Xiang , Sudipta Sengupta , Rama Krishna Sandeep Pokkunuri , Srinivas Iragavarapu , Atul Deo , Ankur Deepak Desai
Abstract: Evaluation data sets may be programmatically generated for code generation models. An evaluation data set is obtained that includes items that correspond to different evaluation tests for a code generation system. The individual items of the evaluation data set may be converted, including the conversion of a function signature for the items, the test statements for the items and using a code generation system to generate the body of the function.
-
Publication number: US12014155B2
Publication date: 2024-06-18
Application number: US17847115
Filing date: 2022-06-22
Applicant: Amazon Technologies, Inc.
Inventor: Praphruetpong Athiwaratkun , Yuchen Tian , Mingyue Shang , Zijian Wang , Ramesh M Nallapati , Parminder Bhatia , Andrew Oliver Arnold , Bing Xiang , Sudipta Sengupta , Yanitsa Donchev , Srinivas Iragavarapu , Matthew Lee , Vamshidhar Krishnamurthy Dantu , Atul Deo , Ankur Deepak Desai
IPC: G06F8/33
CPC classification number: G06F8/33
Abstract: Prefix matching may constrain the generation of next token predictions. Input text to perform a next token prediction may be received. Multiple tokens may be determined from the input text, including a partial token. From possible tokens, one or more possible tokens matching the partial token may be identified. Next token predictions may then be filtered using the identified possible tokens in order to ensure that the partial token is matched.