-
Publication No.: US20230153688A1
Publication Date: 2023-05-18
Application No.: US17984768
Filing Date: 2022-11-10
Applicant: Oracle International Corporation
Inventor: Duy Vu , Varsha Kuppur Rajendra , Dai Hoang Tran , Shivashankar Subramanian , Poorya Zaremoodi , Thanh Long Duong , Mark Edward Johnson
Abstract: Techniques are disclosed for augmentation and batch balancing of training data to improve a machine learning model's handling of negation and its fairness. In one particular aspect, a method is provided that includes: obtaining a training set of labeled examples for training a machine learning model to classify sentiment; searching the training set of labeled examples, or an unlabeled corpus of text on target domains, for sentiment examples having negation cues, sentiment-laden words, words with sentiment prefixes or suffixes, or a combination thereof; rewriting the sentiment examples to create negated versions thereof and generate a labeled negation-pair data set; and training the machine learning model using labeled examples from the labeled negation-pair data set.
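The search-and-rewrite steps in this abstract can be sketched as a rule-based pass over the training set. Everything below (the cue and sentiment-word lists, the `negate` helper, the label encoding 1 = positive / 0 = negative) is an illustrative assumption, not the claimed implementation:

```python
# Minimal sketch: find sentiment examples containing a negation cue or a
# sentiment-laden word, rewrite each into a negated version, and pair the
# original with the rewrite under the flipped polarity label.

NEGATION_CUES = {"not", "never", "no"}
SENTIMENT_WORDS = {"good": "not good", "great": "not great",
                   "bad": "not bad", "terrible": "not terrible"}

def find_candidates(examples):
    """Search step: keep (text, label) pairs whose text contains a
    negation cue or a sentiment-laden word."""
    out = []
    for text, label in examples:
        tokens = set(text.lower().split())
        if tokens & NEGATION_CUES or tokens & SENTIMENT_WORDS.keys():
            out.append((text, label))
    return out

def negate(text, label):
    """Rewrite step: insert 'not' before the first sentiment-laden word
    and flip the polarity label (1 = positive, 0 = negative). Examples
    that are already negated are skipped in this sketch."""
    words = text.split()
    for i, w in enumerate(words):
        if w.lower() in SENTIMENT_WORDS:
            words[i] = SENTIMENT_WORDS[w.lower()]
            return " ".join(words), 1 - label
    return None

def build_negation_pairs(examples):
    """Produce the labeled negation-pair data set: each entry holds the
    original labeled example and its negated, label-flipped version."""
    pairs = []
    for text, label in find_candidates(examples):
        rewritten = negate(text, label)
        if rewritten:
            pairs.append(((text, label), rewritten))
    return pairs

train = [("the movie was great", 1), ("service was terrible", 0)]
pairs = build_negation_pairs(train)
```

Training on both halves of each pair pushes the model to treat negation as label-flipping rather than relying on the presence of a sentiment word alone.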
-
Publication No.: US20230098783A1
Publication Date: 2023-03-30
Application No.: US17952116
Filing Date: 2022-09-23
Applicant: Oracle International Corporation
Inventor: Poorya Zaremoodi , Cong Duy Vu Hoang , Duy Vu , Dai Hoang Tran , Budhaditya Saha , Nagaraj N. Bhat , Thanh Tien Vu , Tuyen Quang Pham , Adam Craig Pocock , Katherine Silverstein , Srinivasa Phani Kumar Gadde , Vishal Vishnoi , Mark Edward Johnson , Thanh Long Duong
IPC: G10L15/06 , G10L15/183
Abstract: Techniques are disclosed herein for focused training of language models and end-to-end hypertuning of the framework. In one aspect, a method is provided that includes obtaining a machine learning model pre-trained for language modeling, and post-training the machine learning model for various tasks to generate a focused machine learning model. The post-training includes: (i) training the machine learning model on an unlabeled set of training data pertaining to a task that the machine learning model was pre-trained for as part of the language modeling, where the unlabeled set of training data is obtained with respect to a target domain, a target task, or a target language; and (ii) training the machine learning model on a labeled set of training data that pertains to an auxiliary task related to a downstream task to be performed using the machine learning model or its output.
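The two post-training phases can be sketched schematically. The `FocusedModel` class and phase names below are hypothetical stand-ins for a real trainer, used only to show the order and inputs of the phases, not Oracle's implementation:

```python
# Schematic of the two-phase post-training: phase (i) continues the original
# language-modeling objective on unlabeled target-focused data; phase (ii)
# adds supervised training on an auxiliary task tied to the downstream task.

class FocusedModel:
    """Records which training phases a pre-trained language model has seen."""

    def __init__(self):
        self.phases = ["lm_pretraining"]  # model arrives already pre-trained

    def train_unlabeled(self, corpus, target):
        # Phase (i): same objective as pre-training, but on unlabeled data
        # drawn with respect to a target domain, task, or language.
        self.phases.append(f"unlabeled_lm:{target}")

    def train_auxiliary(self, labeled_data, aux_task):
        # Phase (ii): supervised training on an auxiliary task related to
        # the downstream task the focused model will ultimately serve.
        self.phases.append(f"auxiliary:{aux_task}")

def post_train(model, unlabeled_corpus, target, labeled_data, aux_task):
    """Run both post-training phases in order to produce a focused model."""
    model.train_unlabeled(unlabeled_corpus, target)
    model.train_auxiliary(labeled_data, aux_task)
    return model

model = post_train(FocusedModel(),
                   ["domain text ..."], "finance",
                   [("utterance", "intent")], "intent-detection")
```

The point of the split is that phase (i) adapts the model's representations to the target distribution without needing labels, while phase (ii) spends the (scarcer) labeled data on a task that transfers to the downstream use.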
-
Publication No.: US12288550B2
Publication Date: 2025-04-29
Application No.: US17952116
Filing Date: 2022-09-23
Applicant: Oracle International Corporation
Inventor: Poorya Zaremoodi , Cong Duy Vu Hoang , Duy Vu , Dai Hoang Tran , Budhaditya Saha , Nagaraj N. Bhat , Thanh Tien Vu , Tuyen Quang Pham , Adam Craig Pocock , Katherine Silverstein , Srinivasa Phani Kumar Gadde , Vishal Vishnoi , Mark Edward Johnson , Thanh Long Duong
IPC: G10L15/06 , G10L15/183
Abstract: Techniques are disclosed herein for focused training of language models and end-to-end hypertuning of the framework. In one aspect, a method is provided that includes obtaining a machine learning model pre-trained for language modeling, and post-training the machine learning model for various tasks to generate a focused machine learning model. The post-training includes: (i) training the machine learning model on an unlabeled set of training data pertaining to a task that the machine learning model was pre-trained for as part of the language modeling, where the unlabeled set of training data is obtained with respect to a target domain, a target task, or a target language; and (ii) training the machine learning model on a labeled set of training data that pertains to an auxiliary task related to a downstream task to be performed using the machine learning model or its output.
-
Publication No.: US20230153528A1
Publication Date: 2023-05-18
Application No.: US17984743
Filing Date: 2022-11-10
Applicant: Oracle International Corporation
Inventor: Duy Vu , Varsha Kuppur Rajendra , Dai Hoang Tran , Shivashankar Subramanian , Poorya Zaremoodi , Thanh Long Duong , Mark Edward Johnson
IPC: G06F40/279 , G06F40/166 , G06N5/02
CPC classification number: G06F40/279 , G06F40/166 , G06N5/022
Abstract: Techniques are disclosed for augmentation and batch balancing of training data to improve a machine learning model's handling of negation and its fairness. In one particular aspect, a method is provided that includes: generating a list of demographic words associated with a demographic group; searching an unlabeled corpus of text to identify unlabeled examples in a target domain comprising at least one demographic word from the list; rewriting the unlabeled examples to create one or more versions of each and generate a fairness-invariance data set; and training the machine learning model using unlabeled examples from the fairness-invariance data set.
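The search-and-rewrite steps here can be sketched as a counterfactual word swap. The word list and the `build_fairness_set` helper below are illustrative assumptions (a real system would need far richer demographic lexicons and grammar-aware rewriting; note the `"her" -> "him"` entry ignores the possessive reading):

```python
# Minimal sketch: find unlabeled sentences mentioning a demographic word,
# rewrite each by substituting the counterpart word, and pair original with
# variant. A fair model should score both members of a pair identically.

DEMOGRAPHIC_WORDS = {"he": "she", "she": "he", "him": "her",
                     "his": "her", "her": "him"}

def find_examples(corpus, words):
    """Search step: keep unlabeled sentences containing a demographic word."""
    return [s for s in corpus if set(s.lower().split()) & words]

def rewrite(sentence, swap):
    """Rewrite step: substitute each demographic word with its counterpart,
    leaving all other words untouched."""
    return " ".join(swap.get(w.lower(), w) for w in sentence.split())

def build_fairness_set(corpus):
    """Produce the fairness-invariance data set of (original, variant) pairs."""
    matches = find_examples(corpus, set(DEMOGRAPHIC_WORDS))
    return [(s, rewrite(s, DEMOGRAPHIC_WORDS)) for s in matches]

corpus = ["he paid his bill late", "the invoice was sent"]
fairness_set = build_fairness_set(corpus)
```

Batch balancing would then ensure each training batch draws evenly across the demographic variants so no single group dominates a gradient step.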