-
Publication No.: US20250094464A1
Publication Date: 2025-03-20
Application No.: US18885060
Filing Date: 2024-09-13
Applicant: Oracle International Corporation
Inventor: Xu Zhong, Aashna Devang Kanuga
IPC: G06F16/33
Abstract: Techniques are disclosed herein for selecting document chunks that are most relevant to a query. The techniques include receiving a query and comparing a plurality of stored text passages to the query using a first similarity metric. Based on the comparison, a subset of the plurality of stored text passages that are most similar to the query is selected. A plurality of sentences from the subset of the plurality of stored text passages are identified. The identified sentences are ranked based on the query and a second similarity metric. A subset of the sentences is selected based on the ranking. The subset of the sentences or a derivative thereof is output in response to the query.
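The two-stage selection described in this abstract can be illustrated with a minimal sketch. The abstract does not name either similarity metric, so the choices here are assumptions made purely for illustration: cosine similarity over term counts stands in for the first metric and Jaccard token overlap for the second. The function names and the `top_passages` / `top_sentences` parameters are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Split text into lowercase alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Assumed first metric: cosine similarity over term counts."""
    ca, cb = Counter(tokenize(a)), Counter(tokenize(b))
    dot = sum(ca[t] * cb[t] for t in ca)
    norm = math.sqrt(sum(v * v for v in ca.values())) * math.sqrt(
        sum(v * v for v in cb.values())
    )
    return dot / norm if norm else 0.0


def jaccard(a, b):
    """Assumed second metric: Jaccard overlap of token sets."""
    sa, sb = set(tokenize(a)), set(tokenize(b))
    return len(sa & sb) / len(sa | sb) if sa | sb else 0.0


def select_sentences(query, passages, top_passages=2, top_sentences=2):
    # Stage 1: keep the stored passages most similar to the query.
    kept = sorted(passages, key=lambda p: cosine(query, p), reverse=True)[:top_passages]
    # Stage 2: split the kept passages into sentences and rank them.
    sentences = [
        s.strip()
        for p in kept
        for s in re.split(r"(?<=[.!?])\s+", p)
        if s.strip()
    ]
    ranked = sorted(sentences, key=lambda s: jaccard(query, s), reverse=True)
    return ranked[:top_sentences]
```

Using a cheap metric to narrow the passages and a second metric to rank individual sentences keeps the finer-grained comparison limited to a small candidate set.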
-
Publication No.: US20220229991A1
Publication Date: 2022-07-21
Application No.: US17580535
Filing Date: 2022-01-20
Applicant: Oracle International Corporation
Inventor: Thanh Long Duong, Vishal Vishnoi, Mark Edward Johnson, Elias Luqman Jalaluddin, Tuyen Quang Pham, Cong Duy Vu Hoang, Poorya Zaremoodi, Srinivasa Phani Kumar Gadde, Aashna Devang Kanuga, Zikai Li, Yuanxu Wu
IPC: G06F40/289, G06F40/166, G06N3/08
Abstract: Techniques are disclosed for multi-feature balancing for natural language processors. In an embodiment, a method includes receiving a natural language query to be processed by a machine learning model, the machine learning model utilizing a dataset of natural language phrases for processing natural language queries; determining, based on the machine learning model and the natural language query, a feature dropout value; generating, based on the natural language query, one or more contextual features and one or more expressional features that may be input to the machine learning model; modifying at least one of the one or more contextual features and the one or more expressional features based on the feature dropout value to generate a set of input features for the machine learning model; and processing the set of input features to cause generation of an output dataset corresponding to the natural language query.
-
Publication No.: US20240419910A1
Publication Date: 2024-12-19
Application No.: US18819441
Filing Date: 2024-08-29
Applicant: Oracle International Corporation
Inventor: Thanh Long Duong, Vishal Vishnoi, Mark Edward Johnson, Elias Luqman Jalaluddin, Tuyen Quang Pham, Cong Duy Vu Hoang, Poorya Zaremoodi, Srinivasa Phani Kumar Gadde, Aashna Devang Kanuga, Zikai Li, Yuanxu Wu
IPC: G06F40/289, G06F40/166, G06F40/205, G06F40/263, G06F40/279, G06F40/295, G06N3/08, H04L51/02
Abstract: A method includes receiving an indication of a first coverage value corresponding to a desired overlap between a dataset of natural language phrases and a training dataset for training a machine learning model; determining a second coverage value corresponding to a measured overlap between the dataset of natural language phrases and the training dataset; determining a coverage delta value based on a comparison between the first coverage value and the second coverage value; modifying, based on the coverage delta value, the dataset of natural language phrases; and processing, utilizing a machine learning model including the modified dataset of natural language phrases, an input dataset including a set of input features. The machine learning model processes the input dataset based at least in part on the dataset of natural language phrases to generate an output dataset.
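The coverage comparison in this abstract can be sketched as follows. The abstract does not say how overlap is measured or how the phrase dataset is modified, so both are assumptions here: coverage is taken as the fraction of phrases that occur as substrings of at least one training utterance, and modification is limited to dropping covered phrases when measured coverage exceeds the desired value. The function names are hypothetical.

```python
def is_covered(phrase, training_utterances):
    """A phrase counts as covered if it appears in any training utterance."""
    return any(phrase in u for u in training_utterances)


def measured_coverage(phrases, training_utterances):
    """Assumed overlap measure: fraction of phrases seen in the training data."""
    if not phrases:
        return 0.0
    covered = sum(is_covered(p, training_utterances) for p in phrases)
    return covered / len(phrases)


def rebalance(phrases, training_utterances, desired):
    """Drop covered phrases until the coverage delta is no longer positive.

    Only the "too much overlap" direction is handled in this sketch.
    """
    phrases = list(phrases)
    delta = measured_coverage(phrases, training_utterances) - desired
    while delta > 0:
        # Find the last phrase that is still covered by the training data.
        idx = next(
            (
                i
                for i in reversed(range(len(phrases)))
                if is_covered(phrases[i], training_utterances)
            ),
            None,
        )
        if idx is None:
            break
        phrases.pop(idx)
        delta = measured_coverage(phrases, training_utterances) - desired
    return phrases
```

For example, with four phrases of which three appear in the training utterances, measured coverage is 0.75; asking for 0.5 yields a delta of 0.25 and two covered phrases are removed.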
-
Publication No.: US20240232187A9
Publication Date: 2024-07-11
Application No.: US18321144
Filing Date: 2023-05-22
Applicant: Oracle International Corporation
Inventor: Chang Xu, Poorya Zaremoodi, Cong Duy Vu Hoang, Nitika Mathur, Philip Arthur, Steve Wai-Chun Siu, Aashna Devang Kanuga, Gioacchino Tangari, Mark Edward Johnson, Thanh Long Duong, Vishal Vishnoi, Stephen Andrew McRitchie, Christopher Mark Broadbent
IPC: G06F16/2452, G06F40/211, G06F40/30
CPC classification number: G06F16/24522, G06F40/211, G06F40/30
Abstract: The present disclosure is related to techniques for converting a natural language utterance to a logical form query and deriving a natural language interpretation of the logical form query. The techniques include accessing a Meaning Resource Language (MRL) query and converting the MRL query into a MRL structure including logical form statements. The converting includes extracting operations and associated attributes from the MRL query and generating the logical form statements from the operations and associated attributes. The techniques further include translating each of the logical form statements into a natural language expression based on a grammar data structure that includes a set of rules for translating logical form statements into corresponding natural language expressions, combining the natural language expressions into a single natural language expression, and providing the single natural language expression as an interpretation of the natural language utterance.
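The rule-based translation step in this abstract can be sketched as below. The abstract gives neither the MRL syntax nor the grammar format, so everything concrete here is an assumption: logical form statements are modeled as `(operation, attributes)` pairs, and the grammar data structure is a plain dictionary mapping each operation to a text pattern. `GRAMMAR` and `interpret` are hypothetical names.

```python
# Hypothetical grammar: one natural language pattern per operation.
GRAMMAR = {
    "select": "show {attribute}",
    "from": "from {table}",
    "filter": "where {attribute} is {value}",
}


def interpret(statements, grammar=GRAMMAR):
    """Translate each logical form statement, then combine the results."""
    # Each statement is an (operation, attributes) pair extracted from a query.
    parts = [grammar[op].format(**attrs) for op, attrs in statements]
    # Combine the per-statement expressions into a single expression.
    return " ".join(parts)
```

A real grammar would need ordering, agreement, and nesting rules; the dictionary lookup only shows the statement-by-statement translation followed by a combining step.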
-
Publication No.: US20240013780A1
Publication Date: 2024-01-11
Application No.: US18471491
Filing Date: 2023-09-21
Applicant: Oracle International Corporation
Inventor: Srinivasa Phani Kumar Gadde, Yuanxu Wu, Aashna Devang Kanuga, Elias Luqman Jalaluddin, Vishal Vishnoi, Mark Edward Johnson
IPC: G10L15/197, G10L15/06, G10L15/26, H04L51/02, H04L51/52, G06F40/186, G06F40/295, G06F40/30, G06N20/00, G06F40/35
CPC classification number: G10L15/197, G10L15/063, G10L15/26, H04L51/02, H04L51/52, G06F40/186, G06F40/295, G06F40/30, G06N20/00, G06F40/35, G10L2015/0631, G06N3/044
Abstract: Techniques for data augmentation for training chatbot systems in natural language processing. In one particular aspect, a method is provided that includes generating a list of values to cover for an entity, selecting utterances from a set of data that have context for the entity, converting the utterances into templates, where each template of the templates comprises a slot that maps to the list of values for the entity, selecting a template from the templates, selecting a value from the list of values based on the mapping between the slot within the selected template and the list of values for the entity; and creating an artificial utterance based on the selected template and the selected value, where the creating the artificial utterance comprises inserting the selected value into the slot of the selected template that maps to the list of values for the entity.
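The template-and-slot procedure in this abstract can be sketched in a few lines. This is an illustration under stated assumptions, not the patented method: utterances with context for the entity are detected by simple substring matching against the value list, and templates and values are picked uniformly at random. The names `to_template` and `augment` and the `seed` parameter are hypothetical.

```python
import random


def to_template(utterance, entity, values):
    """Replace the first known entity value in the utterance with a slot."""
    slot = "{" + entity + "}"
    for v in values:
        if v in utterance:
            return utterance.replace(v, slot, 1)
    return None  # the utterance carries no context for this entity


def augment(utterances, entity, values, n, seed=0):
    """Create n artificial utterances by filling template slots with values."""
    rng = random.Random(seed)
    slot = "{" + entity + "}"
    templates = [
        t for t in (to_template(u, entity, values) for u in utterances) if t
    ]
    if not templates:
        return []
    out = []
    for _ in range(n):
        template = rng.choice(templates)  # select a template
        value = rng.choice(values)  # select a value mapped to the slot
        out.append(template.replace(slot, value))
    return out
```

Substring matching is a simplification: overlapping values such as "York" and "New York" would need longest-match handling or token-level alignment.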
-
Publication No.: US20230186161A1
Publication Date: 2023-06-15
Application No.: US18065422
Filing Date: 2022-12-13
Applicant: Oracle International Corporation
Inventor: Philip Arthur, Vishal Vishnoi, Mark Edward Johnson, Thanh Long Duong, Srinivasa Phani Kumar Gadde, Balakota Srinivas Vinnakota, Cong Duy Vu Hoang, Steve Wai-Chun Siu, Nitika Mathur, Gioacchino Tangari, Aashna Devang Kanuga
IPC: G06N20/00, G06F40/58, G06F40/284, G06F40/237
CPC classification number: G06N20/00, G06F40/58, G06F40/284, G06F40/237, G06F40/35
Abstract: Techniques are disclosed herein for synthesizing training data to facilitate training a natural language to logical form model. In one aspect, training data can be synthesized from original training data under a framework based on templates and a synchronous context-free grammar. In one aspect, training data can be synthesized under a framework based on a probabilistic context-free grammar and a translator. In one aspect, training data can be synthesized under a framework based on tree-to-string translation. In one aspect, the synthetic training data can be combined with original training data in order to train a machine learning model to translate an utterance to a logical form.
-
Publication No.: US20250094725A1
Publication Date: 2025-03-20
Application No.: US18624472
Filing Date: 2024-04-02
Applicant: Oracle International Corporation
Inventor: Vishal Vishnoi, Xin Xu, Diego Andres Cornejo Barra, Ying Xu, Yakupitiyage Don Thanuja Samodhye Dharmasiri, Aashna Devang Kanuga, Srinivasa Phani Kumar Gadde, Thanh Long Duong, Mark Edward Johnson
IPC: G06F40/35, G06F16/332
Abstract: Techniques are disclosed herein for implementing digital assistants using generative artificial intelligence. An input prompt comprising a natural language utterance and candidate agents and associated actions can be constructed. An execution plan can be generated using a first generative artificial intelligence model based on the input prompt. The execution plan can be executed to perform actions included in the execution plan using agents indicated by the execution plan. A response to the natural language utterance can be generated by a second generative artificial intelligence model using one or more outputs from executing the execution plan.
-
Publication No.: US20240061833A1
Publication Date: 2024-02-22
Application No.: US18218385
Filing Date: 2023-07-05
Applicant: Oracle International Corporation
Inventor: Gioacchino Tangari, Nitika Mathur, Philip Arthur, Cong Duy Vu Hoang, Aashna Devang Kanuga, Steve Wai-Chun Siu, Syed Najam Abbas Zaidi, Poorya Zaremoodi, Thanh Long Duong, Mark Edward Johnson
IPC: G06F16/2452, G06F16/242, G06F40/247, G06F40/284
CPC classification number: G06F16/24522, G06F16/243, G06F40/247, G06F40/284
Abstract: Techniques are disclosed for augmenting training data for training a machine learning model to generate database queries. Training data comprising a first training example comprising a first natural language utterance, a logical form for the first natural language utterance, and associated first metadata is obtained. From the first training example, a template utterance is generated. A second natural language utterance is generated by filling slots in the template utterance based on a database schema and database values. Updated metadata is produced based on the first metadata and the second natural language utterance. A second training example is generated, comprising the second natural language utterance, the logical form for the first natural language utterance, and the updated metadata. The training data is augmented by adding the second training example. A machine learning model is trained to generate a database query comprising the database operation using the augmented training data set.
-
Publication No.: US20230186025A1
Publication Date: 2023-06-15
Application No.: US18065387
Filing Date: 2022-12-13
Applicant: Oracle International Corporation
Inventor: Jae Min John, Vishal Vishnoi, Mark Edward Johnson, Thanh Long Duong, Srinivasa Phani Kumar Gadde, Balakota Srinivas Vinnakota, Shivashankar Subramanian, Cong Duy Vu Hoang, Yakupitiyage Don Thanuja Samodhye Dharmasiri, Nitika Mathur, Aashna Devang Kanuga, Philip Arthur, Gioacchino Tangari, Steve Wai-Chun Siu
IPC: G06F40/284, G06F40/295, G06F40/42
CPC classification number: G06F40/284, G06F40/295, G06F40/42
Abstract: Techniques for preprocessing data assets to be used in a natural language to logical form model based on scalable search and content-based schema linking. In one particular aspect, a method includes accessing an utterance, classifying named entities within the utterance into predefined classes, searching value lists within the database schema using tokens from the utterance to identify and output value matches including: (i) any value within the value lists that matches a token from the utterance and (ii) any attribute associated with a matching value, generating a data structure by organizing and storing: (i) each of the named entities and an assigned class for each of the named entities, (ii) each of the value matches and the token matching each of the value matches, and (iii) the utterance, in a predefined format for the data structure, and outputting the data structure.
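The value-matching and packaging steps in this abstract can be sketched as follows. Several pieces are assumptions for illustration only: a lookup table (`entity_classes`) stands in for a real named-entity classifier, value lists are a dictionary keyed by attribute name, matching is exact and case-insensitive on single tokens, and the output dictionary layout is a hypothetical "predefined format". `link_utterance` is a hypothetical name.

```python
import re


def link_utterance(utterance, value_lists, entity_classes):
    """Build a linking record for one utterance.

    value_lists: attribute name -> list of values stored for that attribute.
    entity_classes: lowercase token -> class label (stand-in for an NER model).
    """
    tokens = re.findall(r"[a-z0-9]+", utterance.lower())
    # Classify named entities into predefined classes (table lookup here).
    named_entities = [
        {"text": t, "class": entity_classes[t]} for t in tokens if t in entity_classes
    ]
    # Search each attribute's value list using the utterance tokens.
    value_matches = []
    for attribute, values in value_lists.items():
        lowered = {v.lower(): v for v in values}
        for t in tokens:
            if t in lowered:
                value_matches.append(
                    {"token": t, "value": lowered[t], "attribute": attribute}
                )
    # Organize entities, value matches, and the utterance in one structure.
    return {
        "utterance": utterance,
        "named_entities": named_entities,
        "value_matches": value_matches,
    }
```

The abstract stresses scalable search; a production system would likely replace the per-attribute scan with an index over values, and would handle multi-token values, which this sketch does not.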
-
Publication No.: US20210390951A1
Publication Date: 2021-12-16
Application No.: US17345288
Filing Date: 2021-06-11
Applicant: Oracle International Corporation
Inventor: Srinivasa Phani Kumar Gadde, Yuanxu Wu, Aashna Devang Kanuga, Elias Luqman Jalaluddin, Vishal Vishnoi, Mark Edward Johnson
IPC: G10L15/197, H04L12/58, G10L15/26, G10L15/06
Abstract: Techniques for data augmentation for training chatbot systems in natural language processing. In one particular aspect, a method is provided that includes generating a list of values to cover for an entity, selecting utterances from a set of data that have context for the entity, converting the utterances into templates, where each template of the templates comprises a slot that maps to the list of values for the entity, selecting a template from the templates, selecting a value from the list of values based on the mapping between the slot within the selected template and the list of values for the entity; and creating an artificial utterance based on the selected template and the selected value, where the creating the artificial utterance comprises inserting the selected value into the slot of the selected template that maps to the list of values for the entity.