UNIFY95: META-LEARNING CONTAMINATION THRESHOLDS FROM UNIFIED ANOMALY SCORES

Publication No.: US20240095580A1

Publication Date: 2024-03-21

Application No.: US17994530

Filing Date: 2022-11-28

    CPC classification number: G06N20/00

    Abstract: Herein is a universal anomaly threshold based on several labeled datasets and transformation of anomaly scores from one or more anomaly detectors. In an embodiment, a computer meta-learns from each anomaly detection algorithm and each labeled dataset as follows. A respective anomaly detector based on the anomaly detection algorithm is trained based on the dataset. The anomaly detector infers respective anomaly scores for tuples in the dataset. The following are ensured in the anomaly scores from the anomaly detector: i) regularity that an anomaly score of zero cannot indicate an anomaly and ii) normality that an inclusive range of zero to one contains the anomaly scores from the anomaly detector. A respective anomaly threshold is calculated for the anomaly scores from the anomaly detector. After all meta-learning, a universal anomaly threshold is calculated as an average of the anomaly thresholds. An anomaly is detected based on the universal anomaly threshold.
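The score pipeline described above can be sketched in a few lines of numpy. This is a minimal illustration, not the patented method: min-max normalization stands in for whatever transformation enforces regularity and normality, and an accuracy-maximizing search stands in for the per-detector threshold calculation (the abstract fixes neither choice). The function names are hypothetical.

```python
import numpy as np

def normalize_scores(scores):
    """Min-max map raw scores into [0, 1] so that a score of zero cannot
    indicate an anomaly (regularity) and all scores lie in the inclusive
    range zero to one (normality)."""
    scores = np.asarray(scores, dtype=float)
    lo, hi = scores.min(), scores.max()
    return np.zeros_like(scores) if hi == lo else (scores - lo) / (hi - lo)

def per_dataset_threshold(scores, labels):
    """Choose the threshold on normalized scores that maximizes accuracy
    against the dataset's anomaly labels (an assumed criterion)."""
    labels = np.asarray(labels, dtype=bool)
    best_t, best_acc = 0.5, -1.0
    for t in np.unique(scores):
        acc = np.mean((scores >= t) == labels)
        if acc > best_acc:
            best_t, best_acc = t, acc
    return best_t

def universal_threshold(score_sets, label_sets):
    """After meta-learning over all detector/dataset pairs, average the
    per-dataset thresholds into one universal anomaly threshold."""
    return float(np.mean([per_dataset_threshold(normalize_scores(s), l)
                          for s, l in zip(score_sets, label_sets)]))
```

At inference time, a tuple whose normalized score meets or exceeds the universal threshold would be flagged as an anomaly.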

    AUTOMATED DATASET DRIFT DETECTION
Invention Application

Publication No.: US20230139718A1

Publication Date: 2023-05-04

Application No.: US17513760

Filing Date: 2021-10-28

    Abstract: Herein are acceleration and increased reliability based on classification and scoring techniques for machine learning that compare two similar datasets of different ages to detect data drift without a predefined drift threshold. Various subsets are randomly sampled from the datasets. The subsets are combined in various ways to generate subsets of various age mixtures. In an embodiment, ages are permuted and drift is detected based on whether or not fitness scores indicate that an age binary classifier is confused. In an embodiment, an anomaly detector measures outlier scores of two subsets of different age mixtures. Drift is detected when the outlier scores diverge. In a two-arm bandit embodiment, iterations randomly alternate between both datasets based on respective probabilities that are adjusted by a bandit reward based on outlier scores from an anomaly detector. Drift is detected based on the probability of the younger dataset.
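The classifier-confusion embodiment can be sketched as follows. A nearest-centroid rule stands in for whatever binary age classifier an implementation would use, and the 0.1 band around 50% held-out accuracy is an assumed tuning constant; both are illustrative choices, not taken from the patent.

```python
import numpy as np

def drift_detected(old_data, new_data, n_rounds=20, confusion_band=0.1, seed=0):
    """Detect drift by checking whether a simple age classifier can tell
    old samples from new ones; held-out accuracy near 0.5 means the
    classifier is confused, so no drift is flagged."""
    rng = np.random.default_rng(seed)
    accs = []
    for _ in range(n_rounds):
        # randomly sample a subset of each age
        old = old_data[rng.choice(len(old_data), len(old_data) // 2, replace=False)]
        new = new_data[rng.choice(len(new_data), len(new_data) // 2, replace=False)]
        ho, hn = len(old) // 2, len(new) // 2
        # "train" on the first halves: one centroid per age
        c_old, c_new = old[:ho].mean(axis=0), new[:hn].mean(axis=0)
        # score the held-out halves, labeled by true age (0 = old, 1 = new)
        test = np.vstack([old[ho:], new[hn:]])
        truth = np.array([0] * (len(old) - ho) + [1] * (len(new) - hn))
        pred = (np.linalg.norm(test - c_new, axis=1)
                < np.linalg.norm(test - c_old, axis=1)).astype(int)
        accs.append(np.mean(pred == truth))
    return bool(abs(np.mean(accs) - 0.5) > confusion_band)
```

If the two datasets come from the same distribution, accuracy hovers near chance and no drift is reported; once the younger data shifts, the classifier separates the ages easily and drift is flagged.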

    EFFICIENT AND ACCURATE REGIONAL EXPLANATION TECHNIQUE FOR NLP MODELS

Publication No.: US20220309360A1

Publication Date: 2022-09-29

Application No.: US17212163

Filing Date: 2021-03-25

    Abstract: Herein are techniques for topic modeling and content perturbation that provide machine learning (ML) explainability (MLX) for natural language processing (NLP). A computer hosts an ML model that infers an original inference for each of many text documents that contain many distinct terms. To each text document (TD) is assigned, based on terms in the TD, a topic that contains a subset of the distinct terms. In a perturbed copy of each TD, a perturbed subset of the distinct terms is replaced. For the perturbed copy of each TD, the ML model infers a perturbed inference. For TDs of a topic, the computer detects that a difference between original inferences of the TDs of the topic and perturbed inferences of the TDs of the topic exceeds a threshold. Based on terms in the TDs of the topic, the topic is replaced with multiple, finer-grained new topics. After sufficient topic modeling, a regional explanation of the ML model is generated.
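A toy sketch of the perturbation loop above, under stated assumptions: topics are plain term sets, a document is assigned the topic with the greatest term overlap, and perturbation deletes the topic's terms (the patent leaves the topic model and perturbation scheme open). The recursive topic-splitting step is only noted in a comment.

```python
def regional_explanation(model, docs, topics, threshold=0.5):
    """Assign each document a topic by term overlap, perturb member
    documents by deleting the topic's terms, and report topics whose
    mean change in model inference exceeds the threshold -- per the
    abstract, such topics would be split into finer-grained new topics."""
    def assign(doc):
        words = set(doc.lower().split())
        return max(topics, key=lambda t: len(words & topics[t]))

    def perturb(doc, terms):
        return " ".join(w for w in doc.split() if w.lower() not in terms)

    impact = {}
    for name, terms in topics.items():
        members = [d for d in docs if assign(d) == name]
        if members:
            diffs = [abs(model(d) - model(perturb(d, terms))) for d in members]
            impact[name] = sum(diffs) / len(diffs)
    return {t: imp for t, imp in impact.items() if imp > threshold}
```

Topics that survive the filter are the ones whose terms actually move the model's inferences, which is the regional explanation the abstract describes.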

    FAST, APPROXIMATE CONDITIONAL DISTRIBUTION SAMPLING

Publication No.: US20220261400A1

Publication Date: 2022-08-18

Application No.: US17179265

Filing Date: 2021-02-18

    Abstract: Techniques are described for fast approximate conditional sampling by randomly sampling a dataset and then performing a nearest neighbor search on the pre-sampled dataset to reduce the data over which the nearest neighbor search must be performed and, according to an embodiment, to effectively reduce the number of nearest neighbors that are to be found within the random sample. Furthermore, KD-Tree-based stratified sampling is used to generate a representative sample of a dataset. KD-Tree-based stratified sampling may be used to identify the random sample for fast approximate conditional sampling, which reduces variance in the resulting data sample. As such, using KD-Tree-based stratified sampling to generate the random sample for fast approximate conditional sampling ensures that any nearest neighbor selected, for a target data instance, from the random sample is likely to be among the nearest neighbors of the target data instance within the unsampled dataset.
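The two stages described above can be sketched with numpy: a shallow median-split partitioning stands in for a full KD-Tree, one point is drawn per leaf to stratify the sample, and a brute-force nearest-neighbor search runs over only that small sample. Depth and `k` values are illustrative.

```python
import numpy as np

def kd_stratified_sample(data, depth, rng):
    """Recursively split on the median of alternating dimensions (a
    shallow KD-Tree) and draw one random point per leaf, so the sample
    covers every region of the dataset."""
    if depth == 0 or len(data) <= 1:
        return [data[rng.integers(len(data))]]
    axis = depth % data.shape[1]
    order = np.argsort(data[:, axis])
    mid = len(data) // 2
    return (kd_stratified_sample(data[order[:mid]], depth - 1, rng)
            + kd_stratified_sample(data[order[mid:]], depth - 1, rng))

def approx_conditional_sample(data, target, k=3, depth=4, seed=0):
    """Fast approximate conditional sampling: pre-sample the dataset via
    KD-Tree stratified sampling, then run the nearest-neighbor search
    only over that small representative sample."""
    rng = np.random.default_rng(seed)
    sample = np.array(kd_stratified_sample(np.asarray(data), depth, rng))
    dists = np.linalg.norm(sample - target, axis=1)
    return sample[np.argsort(dists)[:k]]
```

Because every leaf of the partition contributes a point, neighbors found in the sample are likely to be close to the target's true neighbors in the full dataset, while the search touches only 2^depth points instead of all of them.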

    POST-HOC EXPLANATION OF MACHINE LEARNING MODELS USING GENERATIVE ADVERSARIAL NETWORKS

Publication No.: US20220198277A1

Publication Date: 2022-06-23

Application No.: US17131387

Filing Date: 2020-12-22

Abstract: Herein are generative adversarial networks to ensure realistic local samples and surrogate models to provide machine learning (ML) explainability (MLX). Based on many features, an embodiment trains an ML model. The ML model infers an original inference from original values of those features. Based on the same features, a generator model is trained to generate realistic local samples that are distinct combinations of feature values for the features. A surrogate model is trained based on the generator model and based on the original inference by the ML model and/or the original feature values that the original inference is based on. Based on the surrogate model, the ML model is explained. The local samples may be weighted based on semantic similarity to the original feature values, which may facilitate training the surrogate model and/or ranking the relative importance of the features. Local sample weighting may be based on populating a random forest with the local samples.
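The surrogate-fitting stage can be illustrated with a LIME-style sketch. Several substitutions are made for brevity and are not from the patent: a caller-supplied sampler stands in for the trained GAN generator, an RBF kernel on Euclidean distance stands in for semantic-similarity weighting (the abstract mentions a random-forest-based alternative), and a weighted linear model stands in for the surrogate.

```python
import numpy as np

def explain_locally(black_box, x0, generator, n_samples=500, bandwidth=1.0, seed=0):
    """Draw local samples from a generator (stand-in for a trained GAN),
    weight them by similarity to x0, and fit a weighted linear surrogate
    whose coefficients rank the relative importance of the features."""
    rng = np.random.default_rng(seed)
    X = np.array([generator(rng) for _ in range(n_samples)])
    y = np.array([black_box(x) for x in X])
    # similarity weights: samples closer to x0 count more
    w = np.exp(-np.linalg.norm(X - x0, axis=1) ** 2 / (2 * bandwidth ** 2))
    # weighted least squares: fit y ~ X @ coef + intercept
    Xa = np.hstack([X, np.ones((len(X), 1))])
    sw = np.sqrt(w)[:, None]
    coef, *_ = np.linalg.lstsq(Xa * sw, y * np.sqrt(w), rcond=None)
    return coef[:-1]  # per-feature importance (intercept dropped)
```

The returned coefficients explain the black-box model around `x0`: large-magnitude entries mark the features whose local variation most changes the inference.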

    GENERALIZED EXPECTATION MAXIMIZATION

Publication No.: US20220027777A1

Publication Date: 2022-01-27

Application No.: US16935313

Filing Date: 2020-07-22

Abstract: Techniques are described that extend supervised machine-learning algorithms for use with semi-supervised training. Random labels are assigned to unlabeled training data, and the data is split into k partitions. During a label-training iteration, each of these k partitions is combined with the labeled training data, and the combination is used to train a single instance of the machine-learning model. Each of these trained models is then used to predict labels for data points in the k−1 partitions of previously-unlabeled training data that were not used to train the model. Thus, every data point in the previously-unlabeled training data obtains k−1 predicted labels. For each data point, these labels are aggregated to obtain a composite label prediction for the data point. After the labels are determined via one or more label-training iterations, a machine-learning model is trained on the data with the resulting composite label predictions and on the labeled data set.
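One label-training iteration from the abstract can be sketched as follows. A nearest-centroid classifier stands in for whichever supervised learner an implementation would extend, and majority vote is an assumed aggregation rule for the k−1 predictions per point; the abstract specifies neither.

```python
import numpy as np

def centroid_fit(X, y):
    """Train a nearest-centroid model: one mean vector per class."""
    return {c: X[y == c].mean(axis=0) for c in np.unique(y)}

def centroid_predict(model, X):
    """Predict the class whose centroid is nearest to each row of X."""
    classes = sorted(model)
    d = np.stack([np.linalg.norm(X - model[c], axis=1) for c in classes])
    return np.array(classes)[np.argmin(d, axis=0)]

def composite_labels(Xl, yl, Xu, k=3, seed=0):
    """One label-training iteration: assign random labels to the
    unlabeled points Xu, split them into k partitions, train a model on
    the labeled data (Xl, yl) plus one partition, predict the other k-1
    partitions, and aggregate each point's k-1 predictions by vote."""
    rng = np.random.default_rng(seed)
    y_rand = rng.choice(np.unique(yl), size=len(Xu))
    parts = np.array_split(rng.permutation(len(Xu)), k)
    votes = [[] for _ in range(len(Xu))]
    for i in range(k):
        model = centroid_fit(np.vstack([Xl, Xu[parts[i]]]),
                             np.concatenate([yl, y_rand[parts[i]]]))
        for j in range(k):
            if j != i:
                for idx, p in zip(parts[j], centroid_predict(model, Xu[parts[j]])):
                    votes[idx].append(p)
    # composite label: majority vote over each point's k-1 predictions
    return np.array([max(set(v), key=v.count) for v in votes])
```

The composite labels would then feed the final training step, in which a model is trained on the labeled set together with the previously-unlabeled data carrying these composite labels.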
