-
Publication No.: US20250094717A1
Publication date: 2025-03-20
Application No.: US18885356
Filing date: 2024-09-13
Applicant: Oracle International Corporation
Inventor: Aashna Devang Kanuga , Yingqiong Shi , Charles Woodrow Dickstein , Xin Xu , King-Hwa Lee
IPC: G06F40/289 , G06F16/33 , G06F40/205 , G06F40/40
Abstract: Techniques are disclosed for returning references associated with an answer to a query. The techniques include accessing a text portion and identifying a plurality of sentences in the text portion. Each of the sentences is embedded to generate a respective plurality of text sentence embeddings. The text portion or a derivative thereof and a query are provided to a language model and a response to the query based on the text portion is received from the language model. A plurality of sentences are identified in the response. The plurality of sentences in the response is embedded to generate a plurality of response embeddings. The response embeddings are compared to the sentence embeddings to generate a similarity score for each sentence embedding-response embedding pair. Based on the similarity scores, an indication of a subset of the plurality of sentences is output with the response to the query.
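The comparison step in this abstract can be illustrated with a minimal sketch. It is not the applicant's implementation: the word-count `embed` function stands in for a real sentence-embedding model, and the function names, the regex sentence splitter, and the `threshold` value are all assumptions made for illustration.

```python
import math
import re
from collections import Counter


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def embed(sentence):
    """Toy embedding (assumption): lowercase word counts instead of a learned model."""
    return Counter(re.findall(r"[a-z0-9]+", sentence.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def find_references(text_portion, response, threshold=0.5):
    """Return source sentences whose similarity to any response sentence
    meets the (hypothetical) threshold, in order of first match."""
    source = split_sentences(text_portion)
    source_emb = [embed(s) for s in source]
    refs = []
    for resp_sentence in split_sentences(response):
        resp_emb = embed(resp_sentence)
        for i, src_emb in enumerate(source_emb):
            # One similarity score per (source sentence, response sentence) pair.
            if cosine(resp_emb, src_emb) >= threshold and source[i] not in refs:
                refs.append(source[i])
    return refs
```

Under these assumptions, calling `find_references` with the source text and the model's response yields the subset of source sentences to display as references next to the answer.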
-
Publication No.: US20250094455A1
Publication date: 2025-03-20
Application No.: US18885347
Filing date: 2024-09-13
Applicant: Oracle International Corporation
Inventor: Umanga Bista , Ying Xu , Aashna Devang Kanuga , Xin Xu , Vishal Vishnoi , Charles Woodrow Dickstein
IPC: G06F16/332 , G06F16/33
Abstract: Techniques are disclosed herein for contextual query rewriting. The techniques include inputting a first user utterance and a conversation history to a first language model. The first language model identifies an ambiguity in the first user utterance and one or more terms in the conversation history to resolve the ambiguity, modifies the first user utterance to include the one or more terms identified to resolve the ambiguity to generate a modified utterance, and outputs the modified utterance. The computing system provides the modified utterance as input to a second language model. The second language model performs a natural language processing task based on the input modified utterance and outputs a result. The computing system outputs a response to the first user utterance based on the result.
-
Publication No.: US20240134850A1
Publication date: 2024-04-25
Application No.: US18321144
Filing date: 2023-05-21
Applicant: Oracle International Corporation
Inventor: Chang Xu , Poorya Zaremoodi , Cong Duy Vu Hoang , Nitika Mathur , Philip Arthur , Steve Wai-Chun Siu , Aashna Devang Kanuga , Gioacchino Tangari , Mark Edward Johnson , Thanh Long Duong , Vishal Vishnoi , Stephen Andrew McRitchie , Christopher Mark Broadbent
IPC: G06F16/2452 , G06F40/211 , G06F40/30
CPC classification number: G06F16/24522 , G06F40/211 , G06F40/30
Abstract: The present disclosure is related to techniques for converting a natural language utterance to a logical form query and deriving a natural language interpretation of the logical form query. The techniques include accessing a Meaning Resource Language (MRL) query and converting the MRL query into an MRL structure including logical form statements. The converting includes extracting operations and associated attributes from the MRL query and generating the logical form statements from the operations and associated attributes. The techniques further include translating each of the logical form statements into a natural language expression based on a grammar data structure that includes a set of rules for translating logical form statements into corresponding natural language expressions, combining the natural language expressions into a single natural language expression, and providing the single natural language expression as an interpretation of the natural language utterance.
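The rule-based translation described here can be sketched roughly as follows. The abstract does not give the MRL syntax, so the dict-shaped query, the operation names, and the `GRAMMAR` templates below are hypothetical placeholders rather than the format used in the application.

```python
# Hypothetical grammar: one natural language template per operation type.
GRAMMAR = {
    "select": "show {attributes}",
    "from": "from {attributes}",
    "filter": "where {attributes}",
    "order": "sorted by {attributes}",
}


def to_statements(query):
    """Extract (operation, attributes) pairs from a dict-shaped query.

    The dict format is an assumption standing in for a parsed MRL query.
    """
    return [(op, attrs) for op, attrs in query.items() if op in GRAMMAR]


def translate(statement):
    """Translate one logical form statement using its grammar rule."""
    op, attrs = statement
    return GRAMMAR[op].format(attributes=" and ".join(attrs))


def interpret(query):
    """Combine the per-statement expressions into a single expression."""
    return " ".join(translate(s) for s in to_statements(query))


example = {
    "select": ["name", "salary"],
    "from": ["employees"],
    "filter": ["salary > 1000"],
}
print(interpret(example))  # show name and salary from employees where salary > 1000
```

Keeping the rules in a data structure separate from the translation code means new operations can be covered by adding a rule rather than changing the logic, which is the main design point the sketch is meant to show.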
-
Publication No.: US20240062021A1
Publication date: 2024-02-22
Application No.: US18107624
Filing date: 2023-02-09
Applicant: Oracle International Corporation
Inventor: Gioacchino Tangari , Cong Duy Vu Hoang , Mark Edward Johnson , Poorya Zaremoodi , Nitika Mathur , Aashna Devang Kanuga , Thanh Long Duong
IPC: G06F40/58 , G06F40/253
CPC classification number: G06F40/58 , G06F40/253
Abstract: Techniques are disclosed herein for calibrating confidence scores of a machine learning model trained to translate natural language to a meaning representation language. The techniques include obtaining one or more raw beam scores generated from one or more beam levels of a decoder of a machine learning model trained to translate natural language to a logical form, where each of the one or more raw beam scores is a conditional probability of a sub-tree determined by a heuristic search algorithm of the decoder at one of the one or more beam levels, classifying, by a calibration model, a logical form output by the machine learning model as correct or incorrect based on the one or more raw beam scores, and providing the logical form with a confidence score that is determined based on the classifying of the logical form.
-
Publication No.: US20240061832A1
Publication date: 2024-02-22
Application No.: US18209844
Filing date: 2023-06-14
Applicant: Oracle International Corporation
Inventor: Cong Duy Vu Hoang , Stephen Andrew McRitchie , Mark Edward Johnson , Shivashankar Subramanian , Aashna Devang Kanuga , Nitika Mathur , Gioacchino Tangari , Steve Wai-Chun Siu , Poorya Zaremoodi , Vasisht Raghavendra , Thanh Long Duong , Srinivasa Phani Kumar Gadde , Vishal Vishnoi , Christopher Mark Broadbent , Philip Arthur , Syed Najam Abbas Zaidi
IPC: G06F16/2452 , G06F16/2455 , G06F16/242
CPC classification number: G06F16/24522 , G06F16/24561 , G06F16/2433
Abstract: Techniques are disclosed herein for converting a natural language utterance to an intermediate database query representation. An input string is generated by concatenating a natural language utterance with a database schema representation for a database. Based on the input string, a first encoder generates one or more embeddings of the natural language utterance and the database schema representation. A second encoder encodes relations between elements in the database schema representation and words in the natural language utterance based on the one or more embeddings. A grammar-based decoder generates an intermediate database query representation based on the encoded relations and the one or more embeddings. Based on the intermediate database query representation and an interface specification, a database query is generated in a database query language.
-
Publication No.: US11804219B2
Publication date: 2023-10-31
Application No.: US17345288
Filing date: 2021-06-11
Applicant: Oracle International Corporation
Inventor: Srinivasa Phani Kumar Gadde , Yuanxu Wu , Aashna Devang Kanuga , Elias Luqman Jalaluddin , Vishal Vishnoi , Mark Edward Johnson
IPC: G10L15/197 , G10L15/06 , G10L15/26 , G06F40/186 , G06F40/295 , G06F40/30 , G06F40/35 , G06N20/00 , H04L51/02 , H04L51/52 , G06N3/044 , G06N3/045
CPC classification number: G10L15/197 , G06F40/186 , G06F40/295 , G06F40/30 , G06F40/35 , G06N20/00 , G10L15/063 , G10L15/26 , H04L51/02 , H04L51/52 , G06N3/044 , G06N3/045 , G10L2015/0631
Abstract: Techniques for data augmentation for training chatbot systems in natural language processing. In one particular aspect, a method is provided that includes generating a list of values to cover for an entity, selecting utterances from a set of data that have context for the entity, converting the utterances into templates, where each template of the templates comprises a slot that maps to the list of values for the entity, selecting a template from the templates, selecting a value from the list of values based on the mapping between the slot within the selected template and the list of values for the entity; and creating an artificial utterance based on the selected template and the selected value, where the creating the artificial utterance comprises inserting the selected value into the slot of the selected template that maps to the list of values for the entity.
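The template-and-slot procedure in this abstract can be sketched as below. This is an illustration under stated assumptions, not the patented method: the function names, the `{entity}` slot syntax, the whole-word matching used to pick utterances with entity context, and the seeded random sampling are all choices made for the example.

```python
import random
import re


def make_templates(utterances, entity, values):
    """Turn utterances that mention a known entity value into slot templates.

    An utterance is kept only if it contains one of the values (assumed to be
    a reasonable proxy for "has context for the entity").
    """
    slot = "{" + entity + "}"
    templates = []
    for utterance in utterances:
        for value in values:
            pattern = r"\b" + re.escape(value) + r"\b"
            if re.search(pattern, utterance):
                templates.append(re.sub(pattern, slot, utterance))
                break  # one template per utterance is enough for this sketch
    return templates


def augment(templates, entity, values, n, seed=0):
    """Create n artificial utterances by inserting a sampled value into the
    slot of a sampled template."""
    rng = random.Random(seed)
    slot = "{" + entity + "}"
    return [
        rng.choice(templates).replace(slot, rng.choice(values)) for _ in range(n)
    ]


cities = ["Paris", "Tokyo", "Lima"]
seed_utterances = [
    "Book a flight to Paris",
    "What is the weather",
    "Hotels in Tokyo please",
]
city_templates = make_templates(seed_utterances, "city", cities)
artificial = augment(city_templates, "city", cities, n=5)
```

Because the value list is independent of the seed utterances, values that never appear in the original data (here `Lima`) can still be covered by the generated utterances.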
-
Publication No.: US20230186026A1
Publication date: 2023-06-15
Application No.: US18065406
Filing date: 2022-12-13
Applicant: Oracle International Corporation
Inventor: Philip Arthur , Vishal Vishnoi , Mark Edward Johnson , Thanh Long Duong , Srinivasa Phani Kumar Gadde , Balakota Srinivas Vinnakota , Cong Duy Vu Hoang , Steve Wai-Chun Siu , Nitika Mathur , Gioacchino Tangari , Aashna Devang Kanuga
IPC: G06F40/284 , G06F40/211 , G06F40/40 , G06F16/2452 , G06N20/00
CPC classification number: G06F40/284 , G06F40/211 , G06F40/40 , G06F16/24522 , G06N20/00
Abstract: Techniques are disclosed herein for synthesizing synthetic training data to facilitate training a natural language to logical form model. In one aspect, training data can be synthesized from original training data under a framework based on templates and a synchronous context-free grammar. In one aspect, training data can be synthesized under a framework based on a probabilistic context-free grammar and a translator. In one aspect, training data can be synthesized under a framework based on tree-to-string translation. In one aspect, the synthetic training data can be combined with original training data in order to train a machine learning model to translate an utterance to a logical form.
-
Publication No.: US20250156649A1
Publication date: 2025-05-15
Application No.: US18505498
Filing date: 2023-11-09
Applicant: Oracle International Corporation
Inventor: Gioacchino Tangari , Chang Xu , Nitika Mathur , Philip Arthur , Syed Najam Abbas Zaidi , Aashna Devang Kanuga , Cong Duy Vu Hoang , Poorya Zaremoodi , Thanh Long Duong , Mark Edward Johnson , Vishal Vishnoi
IPC: G06F40/40 , G06F40/211 , G06F40/284
Abstract: Techniques are disclosed herein for improving model robustness on operators and triggering keywords in natural language to a meaning representation language system. The techniques include augmenting an original set of training data for a target robustness bucket by leveraging a combination of two training data generation techniques: (1) modification of existing training examples and (2) synthetic template-based example generation. The resulting set of augmented data examples from the two training data generation techniques are appended to the original set of training data to generate an augmented training data set and the augmented training data set is used to train a machine learning model to generate logical forms for utterances.
-
Publication No.: US20250094480A1
Publication date: 2025-03-20
Application No.: US18885071
Filing date: 2024-09-13
Applicant: Oracle International Corporation
Inventor: Yingqiong Shi , Charles Woodrow Dickstein , Aashna Devang Kanuga , Xu Zhong , Xin Xu
IPC: G06F16/383 , G06F16/31 , G06F16/33 , G06F40/205
Abstract: Techniques are disclosed herein for generating and using a knowledge base of information extracted from documents. The techniques include accessing a document comprising text and dividing the document into a plurality of chunks of text. The chunks are indexed by storing each chunk mapped to respective identifying metadata including a chunk index for each chunk. A query is received and a chunk relevant to the query is identified. A prompt is formulated including the query, the identified relevant chunk, and a subsequent chunk. The prompt is provided to a language model and output is received from the language model based on the prompt. An answer to the query is returned based on the received output.
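The chunk indexing and prompt formulation in this abstract can be sketched as follows. The fixed word-count chunking, the word-overlap relevance score, the prompt layout, and all function names are assumptions for illustration; the abstract does not specify how chunks are sized or how relevance is measured, and the call to the language model is left out.

```python
import re


def chunk_document(text, chunk_size=20):
    """Split text into chunks of chunk_size words, each stored with its chunk index."""
    words = text.split()
    return [
        {"chunk_index": i // chunk_size, "text": " ".join(words[i:i + chunk_size])}
        for i in range(0, len(words), chunk_size)
    ]


def tokens(s):
    """Lowercased word set used by the toy relevance score."""
    return set(re.findall(r"[a-z0-9]+", s.lower()))


def most_relevant(chunks, query):
    """Pick the chunk sharing the most words with the query (assumed scoring)."""
    q = tokens(query)
    return max(chunks, key=lambda c: len(q & tokens(c["text"])))


def build_prompt(chunks, query):
    """Formulate a prompt from the query, the relevant chunk, and the chunk after it."""
    hit = most_relevant(chunks, query)
    context = [hit["text"]]
    next_index = hit["chunk_index"] + 1
    if next_index < len(chunks):
        # The stored chunk index makes the subsequent chunk a direct lookup.
        context.append(chunks[next_index]["text"])
    return "Context:\n" + "\n".join(context) + "\n\nQuestion: " + query
```

Including the subsequent chunk is useful when an answer straddles a chunk boundary: the retrieved chunk may contain the matching terms while the following one contains the rest of the passage.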
-
Publication No.: US20250068627A1
Publication date: 2025-02-27
Application No.: US18616801
Filing date: 2024-03-26
Applicant: Oracle International Corporation
Inventor: Cong Duy Vu Hoang , Gioacchino Tangari , Stephen Andrew McRitchie , Nitika Mathur , Aashna Devang Kanuga , Steve Wai-Chun Siu , Dalu Guo , Chang Xu , Mark Edward Johnson , Christopher Mark Broadbent , Thanh Long Duong , Srinivasa Phani Kumar Gadde , Vishal Vishnoi , Chandan Basavaraju , Kenneth Khiaw Hong Eng
IPC: G06F16/2452 , G06F16/2457 , G06F16/28
Abstract: Techniques are disclosed herein for transforming natural language conversations into a visual output. In one aspect, a computer-implemented method includes generating an input string by concatenating a natural language utterance with a schema representation comprising a set of entities for visualization actions, generating, by a first encoder of a machine learning model, one or more embeddings of the input string, encoding, by a second encoder of the machine learning model, relations between elements in the schema representation and words in the natural language utterance based on the one or more embeddings, generating, by a grammar-based decoder of the machine learning model and based on the encoded relations and the one or more embeddings, an intermediate logical form that represents at least the query, the one or more visualization actions, or the combination thereof, and generating, based on the intermediate logical form, a command for a computing system.