-
Publication No.: US20230098783A1
Publication Date: 2023-03-30
Application No.: US17952116
Filing Date: 2022-09-23
Applicant: Oracle International Corporation
Inventor: Poorya Zaremoodi , Cong Duy Vu Hoang , Duy Vu , Dai Hoang Tran , Budhaditya Saha , Nagaraj N. Bhat , Thanh Tien Vu , Tuyen Quang Pham , Adam Craig Pocock , Katherine Silverstein , Srinivasa Phani Kumar Gadde , Vishal Vishnoi , Mark Edward Johnson , Thanh Long Duong
IPC: G10L15/06 , G10L15/183
Abstract: Techniques are disclosed herein for focused training of language models and end-to-end hypertuning of the framework. In one aspect, a method includes obtaining a machine learning model pre-trained for language modeling and post-training it on various tasks to generate a focused machine learning model. The post-training includes: (i) training the model on an unlabeled set of training data for a task the model was pre-trained on as part of the language modeling, where the unlabeled set is drawn from a target domain, target task, or target language; and (ii) training the model on a labeled set of training data for an auxiliary task related to a downstream task to be performed using the model or its output.
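The two-stage post-training in the abstract above can be sketched in Python. This is a minimal sketch, not the patented method: the `FocusedUnigramLM` class, its counts-based unigram "language model", and the token-to-label auxiliary task are all hypothetical stand-ins, since the abstract does not prescribe a model architecture or training API.

```python
from collections import Counter

class FocusedUnigramLM:
    """Toy counts-based stand-in for a pre-trained language model."""

    def __init__(self):
        self.counts = Counter()
        self.aux_labels = {}

    def pretrain(self, corpus):
        # General-purpose language-model pre-training (here: word counts).
        for doc in corpus:
            self.counts.update(doc.split())

    def post_train_unlabeled(self, target_corpus):
        # Stage (i): continue the original LM objective on unlabeled text
        # drawn from the target domain, target task, or target language.
        for doc in target_corpus:
            self.counts.update(doc.split())

    def post_train_auxiliary(self, labeled_pairs):
        # Stage (ii): supervised training on an auxiliary task related to
        # the downstream task (here: a toy token -> label lookup).
        for text, label in labeled_pairs:
            for token in text.split():
                self.aux_labels.setdefault(token, label)

    def prob(self, token):
        # Unigram probability under the focused model.
        total = sum(self.counts.values())
        return self.counts[token] / total if total else 0.0
```

The point of the sketch is the ordering: the same model object first absorbs unlabeled target-domain text under its original objective, then receives supervised signal from a related auxiliary task, yielding one "focused" model.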
-
Publication No.: US20210374361A1
Publication Date: 2021-12-02
Application No.: US16890097
Filing Date: 2020-06-02
Applicant: Oracle International Corporation
Inventor: Michael Louis Wick , Jean-Baptiste Frederic George Tristan , Adam Craig Pocock , Katherine Silverstein
IPC: G06F40/58
Abstract: A method for training a language model using negative data may include accessing a first training corpus comprising positive training data and accessing a second training corpus comprising negative training data. The method may further include training a first language model using at least the first training corpus, the second training corpus, and a maximum likelihood function. The maximum likelihood function may maximize the likelihood of the first language model predicting the positive training data while minimizing the likelihood of the first language model predicting the negative training data.
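The objective described above can be written down directly: maximize log-likelihood on the positive corpus while penalizing log-likelihood on the negative corpus. The `contrastive_ll_objective` function below and its flat token-probability input are hypothetical simplifications (the abstract does not fix a model form); a trainer would maximize this quantity.

```python
import math

def contrastive_ll_objective(token_probs, positive_tokens, negative_tokens,
                             neg_weight=1.0):
    """Objective: sum log p(pos) - neg_weight * sum log p(neg).

    token_probs: mapping from token to its model probability.
    Maximizing this raises likelihood of the positive training data
    while lowering likelihood of the negative training data.
    """
    pos_ll = sum(math.log(token_probs[t]) for t in positive_tokens)
    neg_ll = sum(math.log(token_probs[t]) for t in negative_tokens)
    return pos_ll - neg_weight * neg_ll
```

With `neg_weight=0` this degenerates to ordinary maximum-likelihood training; the negative term is what distinguishes the approach.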
-
Publication No.: US12288550B2
Publication Date: 2025-04-29
Application No.: US17952116
Filing Date: 2022-09-23
Applicant: Oracle International Corporation
Inventor: Poorya Zaremoodi , Cong Duy Vu Hoang , Duy Vu , Dai Hoang Tran , Budhaditya Saha , Nagaraj N. Bhat , Thanh Tien Vu , Tuyen Quang Pham , Adam Craig Pocock , Katherine Silverstein , Srinivasa Phani Kumar Gadde , Vishal Vishnoi , Mark Edward Johnson , Thanh Long Duong
IPC: G10L15/06 , G10L15/183
Abstract: Techniques are disclosed herein for focused training of language models and end-to-end hypertuning of the framework. In one aspect, a method includes obtaining a machine learning model pre-trained for language modeling and post-training it on various tasks to generate a focused machine learning model. The post-training includes: (i) training the model on an unlabeled set of training data for a task the model was pre-trained on as part of the language modeling, where the unlabeled set is drawn from a target domain, target task, or target language; and (ii) training the model on a labeled set of training data for an auxiliary task related to a downstream task to be performed using the model or its output.
-
Publication No.: US10410139B2
Publication Date: 2019-09-10
Application No.: US15168309
Filing Date: 2016-05-31
Applicant: Oracle International Corporation
Inventor: Pallika Haridas Kanani , Michael Louis Wick , Katherine Silverstein
Abstract: A system that performs natural language processing receives a text corpus that includes a plurality of documents and receives a knowledge base. The system generates a set of document n-grams from the text corpus and considers all n-grams as candidate mentions. The system, for each candidate mention, queries the knowledge base and in response retrieves results. From the results retrieved by the queries, the system generates a search space and generates a joint model from the search space.
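The candidate-mention step above can be sketched as follows. This is a simplified illustration: the `ngrams` helper, the dict-based `knowledge_base` (standing in for a real KB query API), and the dict-shaped search space are hypothetical; the patented system's joint model built from this search space is not shown.

```python
def ngrams(tokens, max_n=3):
    """All n-grams of the token list for n = 1..max_n, as strings."""
    return [" ".join(tokens[i:i + n])
            for n in range(1, max_n + 1)
            for i in range(len(tokens) - n + 1)]

def build_search_space(documents, knowledge_base):
    """Treat every document n-gram as a candidate mention, query the
    knowledge base for each, and keep the candidates that return results.

    knowledge_base: mapping from surface form to a list of KB entries
    (a stand-in for a real knowledge-base query interface).
    """
    search_space = {}
    for doc in documents:
        for mention in ngrams(doc.split()):
            results = knowledge_base.get(mention, [])
            if results:
                search_space[mention] = results
    return search_space
```

Because every n-gram is tried, overlapping candidates such as "new york" and "york" can both survive into the search space; disambiguating among them is the job of the joint model the abstract mentions.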