-
Publication No.: US20210374361A1
Publication date: 2021-12-02
Application No.: US16890097
Application date: 2020-06-02
Applicant: Oracle International Corporation
Inventor: Michael Louis Wick, Jean-Baptiste Frederic George Tristan, Adam Craig Pocock, Katherine Silverstein
IPC: G06F40/58
Abstract: A method for training a language model using negative data may include accessing a first training corpus comprising positive training data and accessing a second training corpus comprising negative training data. The method may further include training a first language model using at least the first training corpus, the second training corpus, and a maximum likelihood function. The maximum likelihood function may maximize the likelihood of the first language model predicting the positive training data while minimizing the likelihood of the first language model predicting the negative training data.
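The abstract does not give the exact form of the likelihood function. As a rough illustration only, one possible reading is an objective that adds the log-likelihood of the positive data and subtracts a weighted log-likelihood of the negative data. The function name `negative_data_loss` and the `neg_weight` parameter below are hypothetical and do not come from the patent.

```python
import math


def negative_data_loss(pos_log_probs, neg_log_probs, neg_weight=1.0):
    """Return a loss to minimize (an assumed form, not the patented one).

    pos_log_probs: model log-probabilities of positive training examples.
    neg_log_probs: model log-probabilities of negative training examples.
    neg_weight:    hypothetical knob controlling how strongly negative
                   data is penalized.
    """
    pos_term = sum(pos_log_probs)   # higher is better
    neg_term = sum(neg_log_probs)   # lower is better
    # Minimizing this loss raises pos_term and lowers neg_term.
    return -(pos_term - neg_weight * neg_term)


if __name__ == "__main__":
    pos = [math.log(0.5), math.log(0.25)]
    # The loss drops as the model assigns less probability to negative data.
    print(negative_data_loss(pos, [math.log(0.5)]))
    print(negative_data_loss(pos, [math.log(0.1)]))
```

Note that this simple difference is unbounded below as the negative log-probabilities decrease, so a practical variant would likely need clipping or some other bounded penalty; that detail is also an assumption here.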
-
Publication No.: US12288550B2
Publication date: 2025-04-29
Application No.: US17952116
Application date: 2022-09-23
Applicant: Oracle International Corporation
Inventor: Poorya Zaremoodi, Cong Duy Vu Hoang, Duy Vu, Dai Hoang Tran, Budhaditya Saha, Nagaraj N. Bhat, Thanh Tien Vu, Tuyen Quang Pham, Adam Craig Pocock, Katherine Silverstein, Srinivasa Phani Kumar Gadde, Vishal Vishnoi, Mark Edward Johnson, Thanh Long Duong
IPC: G10L15/06, G10L15/183
Abstract: Techniques are disclosed herein for focused training of language models and end-to-end hypertuning of the framework. In one aspect, a method is provided that includes obtaining a machine learning model pre-trained for language modeling, and post-training the machine learning model for various tasks to generate a focused machine learning model. The post-training includes: (i) training the machine learning model on an unlabeled set of training data pertaining to a task that the machine learning model was pre-trained for as part of the language modeling, and the unlabeled set of training data is obtained with respect to a target domain, a target task, or a target language, and (ii) training the machine learning model on a labeled set of training data that pertains to another task that is an auxiliary task related to a downstream task to be performed using the machine learning model or output from the machine learning model.
-
Publication No.: US10552484B2
Publication date: 2020-02-04
Application No.: US14707283
Application date: 2015-05-08
Applicant: Oracle International Corporation
Inventor: Uri Sheffer, Adam Craig Pocock, Brook Stevens, Mashhood Ishaque, Vladimir Zelevinsky, Tristan R. Spaulding
IPC: G06F17/00, G06F17/30, G06F16/901, G06F16/904, G06F16/33, G06F16/26, G06F16/248
Abstract: A system for exploring data receives the data from a database and indexes the data in a server. The system displays one or more selectable datasets from the indexed data, where the selectable datasets include a plurality of attributes. The system receives a selection of one of the plurality of attributes. The system then sorts the one or more attributes by level of interestingness relative to the selected attribute, and displays the sorted attributes.
-
Publication No.: US11010768B2
Publication date: 2021-05-18
Application No.: US14700683
Application date: 2015-04-30
Applicant: Oracle International Corporation
Inventor: Pallika Haridas Kanani, Michael Louis Wick, Adam Craig Pocock
IPC: G06Q30/00, G06F40/284, G06K9/72, G06F16/84
Abstract: A system is provided that extracts attribute values. The system receives data including unstructured text from a data store. The system further tokenizes the unstructured text into tokens, where a token is a character of the unstructured text. The system further annotates the tokens with attribute labels, where an attribute label for a token is determined, at least in part, based on a word that the token originates from within the unstructured text. The system further groups the tokens into text segments based on the attribute labels, where a set of tokens that are annotated with an identical attribute label are grouped into a text segment, and where the text segments define attribute values. The system further stores the attribute labels and the attribute values within the data store.
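The tokenize, annotate, and group steps described in this abstract can be sketched roughly as follows. This is only an illustration: the dictionary lookup used as the labeler, the default label "O", and the choice to group only consecutive characters are assumptions, and the function names are hypothetical. The patent's actual labeling model is not described in the abstract.

```python
import re
from itertools import groupby


def label_characters(text, word_labels, default="O"):
    """Give every character the label of the word it originates from.

    word_labels is a hypothetical lookup table standing in for whatever
    model assigns attribute labels in a real system.
    """
    labels = [default] * len(text)
    for match in re.finditer(r"\S+", text):
        label = word_labels.get(match.group().lower(), default)
        for i in range(match.start(), match.end()):
            labels[i] = label
    return labels


def group_segments(text, labels):
    """Merge runs of consecutive characters that share a label."""
    if len(text) != len(labels):
        raise ValueError("text and labels must have the same length")
    segments = []
    for label, run in groupby(zip(text, labels), key=lambda pair: pair[1]):
        segments.append((label, "".join(ch for ch, _ in run)))
    return segments


if __name__ == "__main__":
    text = "Red shoe"
    labels = label_characters(text, {"red": "color", "shoe": "type"})
    # Keep only labeled segments as attribute values.
    values = [(l, s) for l, s in group_segments(text, labels) if l != "O"]
    print(values)  # [('color', 'Red'), ('type', 'shoe')]
```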
-
Publication No.: US09779085B2
Publication date: 2017-10-03
Application No.: US14863996
Application date: 2015-09-24
Applicant: Oracle International Corporation
Inventor: Michael Louis Wick, Pallika Haridas Kanani, Adam Craig Pocock
CPC classification number: G06F17/2818, G06F17/2735
Abstract: A natural language processing (“NLP”) manager is provided that manages NLP model training. An unlabeled corpus of multilingual documents is provided that spans a plurality of target languages. A multilingual embedding is trained on the corpus of multilingual documents as input training data, the multilingual embedding being generalized across the target languages by modifying the input training data and/or transforming multilingual dictionaries into constraints in an underlying optimization problem. An NLP model is trained on training data for a first language of the target languages, using word embeddings of the trained multilingual embedding as features. The trained NLP model is applied to data from a second of the target languages, the first and second languages being different.
-
Publication No.: US20230098783A1
Publication date: 2023-03-30
Application No.: US17952116
Application date: 2022-09-23
Applicant: Oracle International Corporation
Inventor: Poorya Zaremoodi, Cong Duy Vu Hoang, Duy Vu, Dai Hoang Tran, Budhaditya Saha, Nagaraj N. Bhat, Thanh Tien Vu, Tuyen Quang Pham, Adam Craig Pocock, Katherine Silverstein, Srinivasa Phani Kumar Gadde, Vishal Vishnoi, Mark Edward Johnson, Thanh Long Duong
IPC: G10L15/06, G10L15/183
Abstract: Techniques are disclosed herein for focused training of language models and end-to-end hypertuning of the framework. In one aspect, a method is provided that includes obtaining a machine learning model pre-trained for language modeling, and post-training the machine learning model for various tasks to generate a focused machine learning model. The post-training includes: (i) training the machine learning model on an unlabeled set of training data pertaining to a task that the machine learning model was pre-trained for as part of the language modeling, and the unlabeled set of training data is obtained with respect to a target domain, a target task, or a target language, and (ii) training the machine learning model on a labeled set of training data that pertains to another task that is an auxiliary task related to a downstream task to be performed using the machine learning model or output from the machine learning model.
-
Publication No.: US10387494B2
Publication date: 2019-08-20
Application No.: US14678218
Application date: 2015-04-03
Applicant: Oracle International Corporation
Inventor: Uri Sheffer, Adam Craig Pocock, Brook Stevens, Mashhood Ishaque, Vladimir Zelevinsky, Tristan R. Spaulding
IPC: G06F17/30, G06F16/901, G06F16/904, G06F16/33, G06F16/26, G06F16/248
Abstract: A system for exploring data receives the data from a database and indexes the data in a server. The system displays one or more selectable datasets from the indexed data, where the selected dataset includes one or more attributes. The system then sorts the one or more attributes by level of interestingness and displays the sorted attributes.
-
Publication No.: US20170024680A1
Publication date: 2017-01-26
Application No.: US14804496
Application date: 2015-07-21
Applicant: Oracle International Corporation
Inventor: Dana Allison, Denis Gulsen, Victor Chung-Wai Chan, Adam Craig Pocock, Pallika Kanani, David Greenberg
CPC classification number: G06Q10/063112, G06F16/24578, G06Q30/016
Abstract: Embodiments described herein provide an efficient multi-dimensional routing algorithm that takes into account decision factors including but not limited to skills of the agents, a channel to be used for a particular contact, personal preferences and other contact specific information, a balance between inbound and outbound contacts, the relative expense of agents for a particular contact, etc. This routing algorithm can be adapted to handle mandatory conditions as well as soft conditions. Each of the various possible conditions can be weighted by the entity implementing the contact center based on a relative importance of the factor to that entity. Embodiments can also include a set of analytics that provides insight into the correlation between the decision factors and desired outcomes which can be used, for example, for proper tuning of the algorithm based on an adjustment of the weight applied to these various factors.
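As a loose sketch of the idea in this abstract, routing with mandatory and weighted soft conditions could look like the following. The function `route_contact`, the dictionary-based agent records, and the example factors (`skill`, `cheapness`) are all hypothetical; the abstract does not specify how scores are actually combined, and a weighted sum is assumed here.

```python
def route_contact(contact, agents, mandatory, soft, weights):
    """Pick an agent for a contact (assumed weighted-sum scoring).

    mandatory: list of predicates (contact, agent) -> bool; all must hold.
    soft:      dict of name -> scoring function (contact, agent) -> float.
    weights:   dict of name -> relative importance chosen by the operator.
    """
    eligible = [a for a in agents
                if all(check(contact, a) for check in mandatory)]
    if not eligible:
        return None  # no agent satisfies the mandatory conditions

    def score(agent):
        return sum(weights[name] * fn(contact, agent)
                   for name, fn in soft.items())

    return max(eligible, key=score)


if __name__ == "__main__":
    agents = [
        {"name": "a", "languages": {"es"}, "skill": 0.9, "cost": 0.8},
        {"name": "b", "languages": {"es", "en"}, "skill": 0.7, "cost": 0.2},
        {"name": "c", "languages": {"en"}, "skill": 1.0, "cost": 0.1},
    ]
    contact = {"language": "es"}
    mandatory = [lambda c, a: c["language"] in a["languages"]]
    soft = {
        "skill": lambda c, a: a["skill"],
        "cheapness": lambda c, a: 1.0 - a["cost"],
    }
    # Changing the weights changes which eligible agent wins.
    print(route_contact(contact, agents, mandatory, soft,
                        {"skill": 1.0, "cheapness": 0.5})["name"])
    print(route_contact(contact, agents, mandatory, soft,
                        {"skill": 2.0, "cheapness": 0.5})["name"])
```

Separating hard filters from the weighted score keeps mandatory conditions from being outvoted by a large soft score, which matches the abstract's distinction between mandatory and soft conditions.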