Enforcing Fairness on Unlabeled Data to Improve Modeling Performance

    Publication Number: US20250068979A1

    Publication Date: 2025-02-27

    Application Number: US18942116

    Application Date: 2024-11-08

    Abstract: Fairness of a trained classifier may be ensured by generating a data set for training, the data set generated using input data points of a feature space including multiple dimensions and according to different parameters including an amount of label bias, a control for discrepancy between rarity of features, and an amount of selection bias. Unlabeled data points of the input data comprising unobserved ground truths are labeled according to the amount of label bias and the input data sampled according to the amount of selection bias and the control for the discrepancy between the rarity of features. The classifier is then trained using the sampled and labeled data points as well as additional unlabeled data points. The trained classifier is then usable to determine unbiased classifications of one or more labels for one or more other data sets.
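
    As a rough illustration of the workflow this abstract describes, the sketch below generates a synthetic training set from unlabeled points under a label-bias parameter, a selection-bias parameter, and a feature-rarity control, then trains a classifier on the sampled labeled points together with additional unlabeled points. The parameter names, the rarity rule, and the use of scikit-learn's self-training wrapper are illustrative assumptions, not the claimed method.

```python
# Hypothetical sketch: biased data-set generation plus semi-supervised training.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.semi_supervised import SelfTrainingClassifier

rng = np.random.default_rng(0)

def make_biased_dataset(X, label_bias=0.1, selection_bias=0.3, rarity_gap=0.5):
    """Label points from an unobserved ground truth, flip a fraction of the
    labels (label bias), then keep points non-uniformly so the rarer feature
    group is under-sampled (selection bias / rarity discrepancy)."""
    true_y = (X[:, 0] + X[:, 1] > 0).astype(int)           # unobserved ground truth
    flipped = rng.random(len(X)) < label_bias               # amount of label bias
    y = np.where(flipped, 1 - true_y, true_y)
    rare_group = X[:, 0] > 1.0                              # a "rare" feature region
    keep_prob = np.where(rare_group, rarity_gap, 1.0) * (1.0 - selection_bias)
    keep = rng.random(len(X)) < keep_prob                   # amount of selection bias
    return X[keep], y[keep]

X = rng.normal(size=(5000, 2))                              # multi-dimensional feature space
X_labeled, y_labeled = make_biased_dataset(X[:4000])
X_unlabeled = X[4000:]                                      # additional unlabeled data points

# Train on the sampled, labeled points plus unlabeled points (label -1 marks unlabeled).
X_train = np.vstack([X_labeled, X_unlabeled])
y_train = np.concatenate([y_labeled, -np.ones(len(X_unlabeled), dtype=int)])
classifier = SelfTrainingClassifier(LogisticRegression()).fit(X_train, y_train)
```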

    Similarity analysis using enhanced MinHash

    Publication Number: US11921687B2

    Publication Date: 2024-03-05

    Application Number: US16436770

    Application Date: 2019-06-10

    CPC classification number: G06F16/2228 G06F17/18 G06F18/22 G06F18/231

    Abstract: A first set and a second set are identified as operands for a set operation of a similarity analysis task iteration. Using respective minimum hash information arrays and contributor count arrays of the two sets, a minimum hash information array and contributor count array of a derived set resulting from the set operation is generated. An entry in the contributor count array of the derived set indicates the number of child sets of the derived set that meet a criterion with respect to a corresponding entry in the minimum hash information array of the derived set. The generated minimum hash information array and the contributor count array are stored as part of input for a subsequent iteration. After a termination criterion of the task is met, output of the task is stored.
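
    The sketch below illustrates one way the per-slot bookkeeping described here could work for a union operation: each slot of the derived signature keeps the smaller of the two child minimum-hash values, and the contributor count records how many child sets attain that minimum. The array layout and the equality criterion are assumptions for illustration, not the claimed data structures.

```python
# Hypothetical sketch: merging MinHash signatures for a set union while
# tracking, per slot, how many child sets contribute the derived minimum.
def merge_union(minhash_a, counts_a, minhash_b, counts_b):
    """Return (minhash, counts) arrays for the set derived by the union."""
    merged_minhash, merged_counts = [], []
    for ha, ca, hb, cb in zip(minhash_a, counts_a, minhash_b, counts_b):
        if ha < hb:
            merged_minhash.append(ha); merged_counts.append(ca)
        elif hb < ha:
            merged_minhash.append(hb); merged_counts.append(cb)
        else:  # both child sets meet the criterion (share the minimum) for this slot
            merged_minhash.append(ha); merged_counts.append(ca + cb)
    return merged_minhash, merged_counts

# One iteration; the resulting arrays feed the next set operation of the task.
mh, ct = merge_union([3, 7, 2], [1, 1, 1], [3, 5, 9], [1, 1, 1])
# mh == [3, 5, 2], ct == [2, 1, 1]
```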

    Enforcing Fairness on Unlabeled Data to Improve Modeling Performance

    Publication Number: US20200372406A1

    Publication Date: 2020-11-26

    Application Number: US16781945

    Application Date: 2020-02-04

    Abstract: Fairness of a trained classifier may be ensured by generating a data set for training, the data set generated using input data points of a feature space including multiple dimensions and according to different parameters including an amount of label bias, a control for discrepancy between rarity of features, and an amount of selection bias. Unlabeled data points of the input data comprising unobserved ground truths are labeled according to the amount of label bias and the input data sampled according to the amount of selection bias and the control for the discrepancy between the rarity of features. The classifier is then trained using the sampled and labeled data points as well as additional unlabeled data points. The trained classifier is then usable to determine unbiased classifications of one or more labels for one or more other data sets.

    Enforcing fairness on unlabeled data to improve modeling performance

    Publication Number: US12175344B2

    Publication Date: 2024-12-24

    Application Number: US18453929

    Application Date: 2023-08-22

    Abstract: Fairness of a trained classifier may be ensured by generating a data set for training, the data set generated using input data points of a feature space including multiple dimensions and according to different parameters including an amount of label bias, a control for discrepancy between rarity of features, and an amount of selection bias. Unlabeled data points of the input data comprising unobserved ground truths are labeled according to the amount of label bias and the input data sampled according to the amount of selection bias and the control for the discrepancy between the rarity of features. The classifier is then trained using the sampled and labeled data points as well as additional unlabeled data points. The trained classifier is then usable to determine unbiased classifications of one or more labels for one or more other data sets.

    Debiasing Pre-trained Sentence Encoders With Probabilistic Dropouts

    Publication Number: US20240419900A1

    Publication Date: 2024-12-19

    Application Number: US18817147

    Application Date: 2024-08-27

    Abstract: Debiasing pre-trained sentence encoders with probabilistic dropouts may be performed by various systems, services, or applications. A sentence may be received, where the words of the sentence may be provided as tokens to an encoder of a machine learning model. A token-wise correlation using semantic orientation may be computed to determine a bias score for the tokens in the input sentence. A probability of dropout for tokens in the input sentence may be determined from the bias scores. The machine learning model may be trained or tuned based on the probabilities of dropout for the tokens in the input sentence.
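
    A hedged sketch of what token-level probabilistic dropout driven by a bias score might look like follows; the semantic-orientation score (absolute cosine similarity against a bias direction), the score-to-probability mapping, and the tensor shapes are illustrative assumptions rather than the patented procedure.

```python
# Hypothetical sketch: per-token dropout probabilities derived from bias scores.
import torch

def bias_scores(token_embeddings, bias_direction):
    """Token-wise semantic orientation: |cosine similarity| between each
    token embedding and a bias direction (e.g., a gender axis)."""
    sims = torch.nn.functional.cosine_similarity(
        token_embeddings, bias_direction.expand_as(token_embeddings), dim=-1)
    return sims.abs()

def dropout_probabilities(scores, max_p=0.5):
    """Map bias scores in [0, 1] to per-token dropout probabilities."""
    return max_p * scores

def token_dropout(token_embeddings, probs):
    """During training, zero whole token embeddings with their per-token
    probability so highly biased tokens contribute less often."""
    keep = (torch.rand(probs.shape) >= probs).float().unsqueeze(-1)
    return token_embeddings * keep

# Toy usage on a (batch, sequence, hidden) tensor of encoder token embeddings.
embeddings = torch.randn(2, 8, 16)
direction = torch.randn(16)
probs = dropout_probabilities(bias_scores(embeddings, direction))
dropped = token_dropout(embeddings, probs)
```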

    AUGMENTING DATA SETS FOR MACHINE LEARNING MODELS

    Publication Number: US20230032208A1

    Publication Date: 2023-02-02

    Application Number: US17389900

    Application Date: 2021-07-30

    Abstract: Techniques are disclosed for augmenting data sets used for training machine learning models and for generating predictions by trained machine learning models. These techniques may increase a number (and diversity) of examples within an initial training dataset of sentences by extracting a subset of words from the existing training dataset of sentences. The extracted subset includes no stopwords and fewer content words than found in the initial training dataset. The remaining words may be re-ordered. Using the extracted and re-ordered subset of words, the dataset generation model produces a second set of sentences that are different from the first set. The second set of sentences may be used to increase a number of examples in classes with few examples.
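
    The extraction step described above might look like the sketch below: stopwords are removed, a fraction of the remaining content words is kept and re-ordered, and the result would then be handed to a sentence-generation model to produce new examples for sparsely populated classes. The stopword list, keep ratio, and downstream generation model are assumptions, not the disclosed technique.

```python
# Hypothetical sketch: extract and re-order a content-word subset for augmentation.
import random

STOPWORDS = {"the", "a", "an", "is", "are", "was", "were", "to", "of",
             "and", "or", "in", "on", "for", "with", "that", "this", "because"}

def extract_keywords(sentence, keep_ratio=0.6, seed=0):
    """Drop stopwords, keep a fraction of the remaining content words,
    and shuffle their order."""
    rng = random.Random(seed)
    content_words = [w for w in sentence.split() if w.lower() not in STOPWORDS]
    k = max(1, int(len(content_words) * keep_ratio))
    subset = rng.sample(content_words, k)
    rng.shuffle(subset)
    return " ".join(subset)

# The extracted keywords would then prompt a text-generation model (e.g., a
# fine-tuned seq2seq model) to write new sentences for under-represented classes.
print(extract_keywords("The shipment was delayed because of a customs inspection"))
```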

    Debiasing Pre-trained Sentence Encoders With Probabilistic Dropouts

    Publication Number: US20220245339A1

    Publication Date: 2022-08-04

    Application Number: US17589662

    Application Date: 2022-01-31

    Abstract: Debiasing pre-trained sentence encoders with probabilistic dropouts may be performed by various systems, services, or applications. A sentence may be received, where the words of the sentence may be provided as tokens to an encoder of a machine learning model. A token-wise correlation using semantic orientation may be computed to determine a bias score for the tokens in the input sentence. A probability of dropout for tokens in the input sentence may be determined from the bias scores. The machine learning model may be trained or tuned based on the probabilities of dropout for the tokens in the input sentence.
