-
Publication number: US11734937B1
Publication date: 2023-08-22
Application number: US16733079
Application date: 2020-01-02
Applicant: Amazon Technologies, Inc.
Inventor: Yahor Pushkin, Sravan Babu Bodapati, Rishita Rajal Anubhai, Dimitrios Soulios, Yaser Al-Onaizan
CPC classification number: G06V30/10, G06F18/2155, G06N5/04, G06N20/20
Abstract: Techniques for creating a text classifier machine learning (ML) model are described. According to some embodiments, a language processing service finetunes a language ML model on unlabeled documents of a user, and then trains that finetuned language ML model on labeled documents of the user to be a text classifier that is customized for that user’s domain, e.g., the user’s documents. Additionally, the finetuned language ML model may be trained on labeled documents of the user, for prediction objectives for unlabeled data, before being trained as the text classifier.
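The two-stage flow in this abstract (adapt to a user's unlabeled documents first, then train a classifier on that user's labeled documents) can be illustrated with a deliberately small sketch. It is not the patented method: smoothed IDF weights stand in for fine-tuning the language model, and a cosine-similarity centroid classifier stands in for the text classifier. The optional intermediate training step mentioned in the abstract is not modeled, and all function names and sample documents are hypothetical.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase whitespace tokens with surrounding punctuation removed."""
    tokens = (t.strip(".,;:!?()").lower() for t in text.split())
    return [t for t in tokens if t]


def adapt_to_domain(unlabeled_docs):
    """Stage 1 (no labels): learn smoothed IDF weights from the user's documents.

    Assumption: this stands in for fine-tuning a language model on unlabeled text.
    """
    n = len(unlabeled_docs)
    doc_freq = Counter()
    for doc in unlabeled_docs:
        doc_freq.update(set(tokenize(doc)))
    idf = {term: math.log((1 + n) / (1 + df)) + 1.0 for term, df in doc_freq.items()}
    default_idf = math.log(1 + n) + 1.0  # weight for terms never seen in stage 1
    return idf, default_idf


def featurize(text, idf, default_idf):
    """Turn text into a sparse TF-IDF vector using the stage 1 weights."""
    counts = Counter(tokenize(text))
    return {term: count * idf.get(term, default_idf) for term, count in counts.items()}


def train_classifier(labeled_docs, idf, default_idf):
    """Stage 2 (labels): sum the feature vectors of each label into a centroid."""
    centroids = defaultdict(Counter)
    for text, label in labeled_docs:
        centroids[label].update(featurize(text, idf, default_idf))
    return dict(centroids)


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(value * b.get(term, 0.0) for term, value in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def classify(text, centroids, idf, default_idf):
    """Return the label whose centroid is most similar to the text."""
    vector = featurize(text, idf, default_idf)
    return max(sorted(centroids), key=lambda label: cosine(vector, centroids[label]))


# Hypothetical user documents, for illustration only.
UNLABELED = [
    "The invoice is overdue and payment is required.",
    "The server crashed and the service is down.",
    "Payment for the invoice was received.",
    "Restart the server to restore the service.",
]
LABELED = [
    ("Please send payment for this invoice.", "billing"),
    ("The server is down again.", "outage"),
]

idf, default_idf = adapt_to_domain(UNLABELED)
centroids = train_classifier(LABELED, idf, default_idf)
print(classify("Invoice payment overdue", centroids, idf, default_idf))
```

The point of the split is that stage 1 needs no labels, so it can use all of a user's documents, while stage 2 reuses what stage 1 learned and only needs the smaller labeled set.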
-
Publication number: US11366855B2
Publication date: 2022-06-21
Application number: US16697948
Application date: 2019-11-27
Applicant: Amazon Technologies, Inc.
Inventor: Jean-Pierre Dodel, Zhiheng Huang, Xiaofei Ma, Ramesh M. Nallapati, Krishnakumar Rajagopalan, Milan Saini, Sudipta Sengupta, Saurabh Kumar Singh, Dimitrios Soulios, Ankit Sultania, Dong Wang, Zhiguo Wang, Bing Xiang, Peng Xu, Yong Yuan
IPC: G06F16/00, G06F16/901, G06N3/04, G06F16/2457, G06F16/903
Abstract: Techniques for searching documents are described. An exemplary method includes receiving a document search query; querying at least one index based upon the document search query to identify matching data; fetching the identified matched data; determining one or more of a top ranked passage and top ranked documents from the set of documents based upon one or more invocations of one or more machine learning models based at least on the fetched identified matched data and the document search query; and returning one or more of the top ranked passage and the proper subset of documents.
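The steps listed in this abstract (query an index, fetch the matches, rank passages and documents, return the top results) follow a common retrieve-then-rank pattern, which the sketch below illustrates under stated assumptions. It is not the patented implementation: an in-memory inverted index stands in for the index, a term-overlap score stands in for the machine learning models, and passages are approximated by splitting on periods. All names and sample documents are hypothetical.

```python
from collections import defaultdict


def tokenize(text):
    """Lowercase whitespace tokens with surrounding punctuation removed."""
    tokens = (t.strip(".,;:!?()").lower() for t in text.split())
    return [t for t in tokens if t]


def build_index(documents):
    """Build an inverted index mapping each term to the ids of documents containing it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for token in tokenize(text):
            index[token].add(doc_id)
    return index


def score(query, text):
    """Stand-in for an ML ranking model: fraction of distinct query terms found in text."""
    query_terms = set(tokenize(query))
    if not query_terms:
        return 0.0
    return len(query_terms & set(tokenize(text))) / len(query_terms)


def search(query, documents, index, top_k=2):
    """Return (top_passage, top_document_ids) for the query."""
    # 1. Query the index to identify matching documents.
    matched_ids = set()
    for token in tokenize(query):
        matched_ids |= index.get(token, set())

    # 2. Fetch the matched data (sorted ids keep the result deterministic).
    fetched = {doc_id: documents[doc_id] for doc_id in sorted(matched_ids)}

    # 3. Rank the fetched documents and keep the best top_k.
    ranked = sorted(fetched, key=lambda d: (-score(query, fetched[d]), d))[:top_k]

    # 4. Rank passages, naively treating each period-delimited sentence as a passage.
    passages = [
        (doc_id, part.strip())
        for doc_id, text in fetched.items()
        for part in text.split(".")
        if part.strip()
    ]
    top_passage = max(passages, key=lambda p: score(query, p[1]), default=None)
    return top_passage, ranked


# Hypothetical corpus, for illustration only.
DOCUMENTS = {
    "a": "Reset your password from the login page. Contact support for help.",
    "b": "Billing runs monthly. Update your card in settings.",
    "c": "The password policy requires twelve characters.",
}
INDEX = build_index(DOCUMENTS)
print(search("reset password", DOCUMENTS, INDEX))
```

Splitting retrieval from ranking means a more expensive scoring step only runs on the documents the index returned, not on the whole collection.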
-