Patent search ap:("Oracle International Corporation") AND inv:"Yu-Heng Hong" Page 1

1.

发明授权
Batching techniques for handling unbalanced training data for a chatbot 有权

公开(公告)号：US12236321B2

公开(公告)日：2025-02-25

申请号：US17217623

申请日：2021-03-30

Applicant: Oracle International Corporation

Inventor： Thanh Long Duong , Mark Edward Johnson , Vishal Vishnoi , Balakota Srinivas Vinnakota , Yu-Heng Hong , Elias Luqman Jalaluddin

IPC: G06N20/00 , G06F16/906 , G06F18/22 , G06F18/2413 , G06F40/30 , G10L15/06 , G10L15/18 , G10L15/197 , G10L15/22

Abstract: The present disclosure relates to chatbot systems, and more particularly, to batching techniques for handling unbalanced training data when training a model such that bias is removed from the trained machine learning model when performing inference. In an embodiment, a plurality of raw utterances is obtained. A bias eliminating distribution is determined and a subset of the plurality of raw utterances is batched according to the bias-reducing distribution. The resulting unbiased training data may be input into a prediction model for training the prediction model. The trained prediction model may be obtained and utilized to predict unbiased results from new inputs received by the trained prediction model.

2.

发明授权
Distance-based logit value for natural language processing 有权

公开(公告)号：US12210842B2

公开(公告)日：2025-01-28

申请号：US18545621

申请日：2023-12-19

Applicant: Oracle International Corporation

Inventor： Ying Xu , Poorya Zaremoodi , Thanh Tien Vu , Cong Duy Vu Hoang , Vladislav Blinov , Yu-Heng Hong , Yakupitiyage Don Thanuja Samodhye Dharmasiri , Vishal Vishnoi , Elias Luqman Jalaluddin , Manish Parekh , Thanh Long Duong , Mark Edward Johnson

IPC: G10L15/16 , G06F40/35 , G06N20/00 , H04L51/02 , G06F40/205 , G06F40/253

Abstract: Techniques for using logit values for classifying utterances and messages input to chatbot systems in natural language processing. A method can include a chatbot system receiving an utterance generated by a user interacting with the chatbot system. The chatbot system can input the utterance into a machine-learning model including a set of binary classifiers. Each binary classifier of the set of binary classifiers can be associated with a modified logit function. The method can also include the machine-learning model using the modified logit function to generate a set of distance-based logit values for the utterance. The method can also include the machine-learning model applying an enhanced activation function to the set of distance-based logit values to generate a predicted output. The method can also include the chatbot system classifying, based on the predicted output, the utterance as being associated with the particular class.

3.

发明授权
Task-oriented dialog suitable for a standalone device 有权

公开(公告)号：US11574636B2

公开(公告)日：2023-02-07

申请号：US17005847

申请日：2020-08-28

Applicant: Oracle International Corporation

Inventor： Thanh Long Duong , Mark Edward Johnson , Vu Cong Duy Hoang , Tuyen Quang Pham , Yu-Heng Hong , Vladislavs Dovgalecs , Guy Bashkansky , Jason Eric Black , Andrew David Bleeker , Serge Le Huitouze

IPC: G10L15/22

Abstract: Described herein are dialog systems, and techniques for providing such dialog systems, that are suitable for use on standalone computing devices. In some embodiments, a dialog system includes a dialog manager, which takes as input an input logical form, which may be a representation of user input. The dialog, manager may include a dialog state tracker, an execution subsystem, a dialog policy subsystem, and a context stack. The dialog state tracker may generate an intermediate logical form from the input logical form combined with a context from the context stack. The context stack may maintain a history of a current dialog, and thus, the intermediate logical form may include contextual information potentially missing from the input logical form. The execution subsystem may execute the intermediate logical form to produce an execution result, and the dialog policy subsystem may generate an output logical form based on the execution result.

4.

发明申请
DISTANCE-BASED LOGIT VALUES FOR NATURAL LANGUAGE PROCESSING 有权

公开(公告)号：US20250117591A1

公开(公告)日：2025-04-10

申请号：US18988114

申请日：2024-12-19

Applicant: Oracle International Corporation

Inventor： Ying XU , Poorya Zaremoodi , Thanh Tien Vu , Cong Duy Vu Hoang , Vladislav Blinov , Yu-Heng Hong , Yakupitiyage Don Thanuja Samodhye Dharmasiri , Vishal Vishnoi , Elias Luqman Jalaluddin , Manish Parekh , Thanh Long Duong , Mark Edward Johnson

IPC: G06F40/35 , G06F40/205 , G06F40/253 , G06N20/00 , H04L51/02

Abstract: Techniques for using logit values for classifying utterances and messages input to chatbot systems in natural language processing. A method can include a chatbot system receiving an utterance generated by a user interacting with the chatbot system. The chatbot system can input the utterance into a machine-learning model including a set of binary classifiers. Each binary classifier of the set of binary classifiers can be associated with a modified logit function. The method can also include the machine-learning model using the modified logit function to generate a set of distance-based logit values for the utterance. The method can also include the machine-learning model applying an enhanced activation function to the set of distance-based logit values to generate a predicted output. The method can also include the chatbot system classifying, based on the predicted output, the utterance as being associated with the particular class.

5.

发明申请
SYSTEM AND METHOD FOR IMPROVING AN END-TO-END AUTOMATIC SPEECH RECOGNITION MODEL 有权

公开(公告)号：US20250095636A1

公开(公告)日：2025-03-20

申请号：US18823371

申请日：2024-09-03

Applicant: Oracle International Corporation

Inventor： Duy Vu , Yu-Heng Hong , Ying Xu , Philip Arthur

IPC: G10L15/06 , G10L13/08 , G10L15/26

Abstract: Techniques are disclosed herein for improving the performance of an end-to-end (E2E) Automatic Speech Recognition (ASR) model in a target domain. A set of test examples are generated. The set of test examples comprise multiple subsets of test examples and each subset of test examples corresponds to a particular test category. A machine language model is then used to convert audio samples of the subset of test examples to text transcripts. A word error rate is determined for the subset of test examples. A test category is then selected based on the word error rates and a set of training examples is generated for training the ASR model in a particular target domain from a selected subset of test examples The training examples are used to fine-tune the model in the target domain. The trained model is then deployed in a cloud infrastructure of a cloud service provider.

6.

发明公开
DISTANCE-BASED LOGIT VALUE FOR NATURAL LANGUAGE PROCESSING 审中-公开

公开(公告)号：US20240126999A1

公开(公告)日：2024-04-18

申请号：US18545621

申请日：2023-12-19

Applicant: Oracle International Corporation

Inventor： Ying Xu , Poorya Zaremoodi , Thanh Tien Vu , Cong Duy Vu Hoang , Vladislav Blinov , Yu-Heng Hong , Yakupitiyage Don Thanuja Samodhye Dharmasiri , Vishal Vishnoi , Elias Luqman Jalaluddin , Manish Parekh , Thanh Long Duong , Mark Edward Johnson

IPC: G06F40/35 , G06N20/00 , H04L51/02

CPC classification number: G06F40/35 , G06N20/00 , H04L51/02 , G06F40/253

Abstract: Techniques for using logit values for classifying utterances and messages input to chatbot systems in natural language processing. A method can include a chatbot system receiving an utterance generated by a user interacting with the chatbot system. The chatbot system can input the utterance into a machine-learning model including a set of binary classifiers. Each binary classifier of the set of binary classifiers can be associated with a modified logit function. The method can also include the machine-learning model using the modified logit function to generate a set of distance-based logit values for the utterance. The method can also include the machine-learning model applying an enhanced activation function to the set of distance-based logit values to generate a predicted output. The method can also include the chatbot system classifying, based on the predicted output, the utterance as being associated with the particular class.

7.

发明公开
WIDE AND DEEP NETWORK FOR LANGUAGE DETECTION USING HASH EMBEDDINGS 审中-公开

公开(公告)号：US20230141853A1

公开(公告)日：2023-05-11

申请号：US18052694

申请日：2022-11-04

Applicant: Oracle International Corporation

Inventor： Thanh Tien Vu , Poorya Zaremoodi , Duy Vu , Mark Edward Johnson , Thanh Long Duong , Xu Zhong , Vladislav Blinov , Cong Duy Vu Hoang , Yu-Heng Hong , Vinamr Goel , Philip Victor Ogren , Srinivasa Phani Kumar Gadde , Vishal Vishnoi

IPC: G06F40/263 , G06F16/31

CPC classification number: G06F40/263 , G06F16/325 , H04L51/02

Abstract: Techniques disclosed herein relate generally to language detection. In one particular aspect, a method is provided that includes obtaining a sequence of n-grams of a textual unit; using an embedding layer to obtain an ordered plurality of embedding vectors for the sequence of n-grams; using a deep network to obtain an encoded vector that is based on the ordered plurality of embedding vectors; and using a classifier to obtain a language prediction for the textual unit that is based on the encoded vector. The deep network includes an attention mechanism, and using the embedding layer to obtain the ordered plurality of embedding vectors comprises, for each n-gram in the sequence of n-grams: obtaining hash values for the n-gram; based on the hash values, selecting component vectors from among the plurality of component vectors; and obtaining an embedding vector for the n-gram that is based on the component vectors.

8.

发明申请
OUT-OF-DOMAIN DATA AUGMENTATION FOR NATURAL LANGUAGE PROCESSING 有权

公开(公告)号：US20220171938A1

公开(公告)日：2022-06-02

申请号：US17452743

申请日：2021-10-28

Applicant: Oracle International Corporation

Inventor： Elias Luqman Jalaluddin , Vishal Vishnoi , Thanh Long Duong , Mark Edward Johnson , Poorya Zaremoodi , Gautam Singaraju , Ying Xu , Vladislav Blinov , Yu-Heng Hong

IPC: G06F40/289 , H04L12/58

Abstract: Techniques for out-of-domain data augmentation for training chatbot systems in natural language processing. In one particular aspect, a method is provided that includes receiving a training set of utterances for training a machine-learning model to identify one or more intents for one or more utterances, and augmenting the training set of utterances with out-of-domain (OOD) examples. The augmenting includes: generating a data set of OOD examples, filtering out OOD examples from the data set of OOD examples, determining a difficulty value for each OOD example remaining within the filtered data set of the OOD examples, and generating augmented batches of utterances comprising utterances from the training set of utterances and utterances from the filtered data set of the OOD based on the difficulty value for each OOD. Thereafter, the machine-learning model is trained using the augmented batches of utterances in accordance with a curriculum training protocol.

9.

发明授权
Out-of-domain data augmentation for natural language processing 有权

公开(公告)号：US12026468B2

公开(公告)日：2024-07-02

申请号：US17452743

申请日：2021-10-28

Applicant: Oracle International Corporation

Inventor： Elias Luqman Jalaluddin , Vishal Vishnoi , Thanh Long Duong , Mark Edward Johnson , Poorya Zaremoodi , Gautam Singaraju , Ying Xu , Vladislav Blinov , Yu-Heng Hong

IPC: G06F40/289 , G06F40/30 , G06N3/08 , H04L51/02

CPC classification number: G06F40/289 , G06F40/30 , G06N3/08 , H04L51/02

Abstract: Techniques for out-of-domain data augmentation for training chatbot systems in natural language processing. In one particular aspect, a method is provided that includes receiving a training set of utterances for training a machine-learning model to identify one or more intents for one or more utterances, and augmenting the training set of utterances with out-of-domain (OOD) examples. The augmenting includes: generating a data set of OOD examples, filtering out OOD examples from the data set of OOD examples, determining a difficulty value for each OOD example remaining within the filtered data set of the OOD examples, and generating augmented batches of utterances comprising utterances from the training set of utterances and utterances from the filtered data set of the OOD based on the difficulty value for each OOD. Thereafter, the machine-learning model is trained using the augmented batches of utterances in accordance with a curriculum training protocol.

10.

发明授权
Distance-based logit value for natural language processing 有权

公开(公告)号：US12019994B2

公开(公告)日：2024-06-25

申请号：US17456916

申请日：2021-11-30

Applicant: Oracle International Corporation

Inventor： Ying Xu , Poorya Zaremoodi , Thanh Tien Vu , Cong Duy Vu Hoang , Vladislav Blinov , Yu-Heng Hong , Yakupitiyage Don Thanuja Samodhye Dharmasiri , Vishal Vishnoi , Elias Luqman Jalaluddin , Manish Parekh , Thanh Long Duong , Mark Edward Johnson

IPC: G06F17/18 , G06F40/35 , G06N20/00 , H04L51/02 , G06F40/205 , G06F40/253

CPC classification number: G06F40/35 , G06N20/00 , H04L51/02 , G06F40/205 , G06F40/253

Abstract: Techniques for using logit values for classifying utterances and messages input to chatbot systems in natural language processing. A method can include a chatbot system receiving an utterance generated by a user interacting with the chatbot system. The chatbot system can input the utterance into a machine-learning model including a set of binary classifiers. Each binary classifier of the set of binary classifiers can be associated with a modified logit function. The method can also include the machine-learning model using the modified logit function to generate a set of distance-based logit values for the utterance. The method can also include the machine-learning model applying an enhanced activation function to the set of distance-based logit values to generate a predicted output. The method can also include the chatbot system classifying, based on the predicted output, the utterance as being associated with the particular class.

Search Results

Country/Region

Patent validity

Application date

Publication (announcement) day

applicant

The country/region where the applicant is located

Inventor

IPC

IPC Department

IPC class

IPC subclass

IPC group

IPC team

Appearance classification