Patent search: ap:("International Business Machines Corporation") AND inv:"Steven George Barbee" (page 1)

1.

Patent application
IDENTIFYING OPTIMAL WEIGHTS TO IMPROVE PREDICTION ACCURACY IN MACHINE LEARNING TECHNIQUES (Status: Active)

Publication number: US20210150407A1

Publication date: 2021-05-20

Application number: US16684396

Filing date: 2019-11-14

Applicant: International Business Machines Corporation

Inventors: Jing Xu, Si Er Han, Steven George Barbee, Xue Ying Zhang, Ji Hui Yang

IPC classification: G06N20/00, G06N5/02

Abstract: A computer-implemented method, system and computer program product for improving prediction accuracy in machine learning techniques. A teacher model is constructed, where the teacher model generates a weight for each data case. The current student model is then trained using training data and the weights generated by the teacher model. After training the current student model, the current student model generates state features, which are used by the teacher model to generate new weights. A candidate student model is then trained using training data and these new weights. A reward is generated by comparing the current student model with the candidate student model using training and testing data, which is used to update the teacher model if a stopping rule has not been satisfied. Upon a stopping rule being satisfied, the weights generated by the teacher model are deemed to be the “optimal” weights which are returned to the user.
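
The loop in this abstract can be illustrated with a minimal Python sketch. It is not the patented method: the student is assumed to be a weighted least-squares line, the state feature is assumed to be each case's absolute residual, the teacher is a hypothetical one-parameter function `teacher`, the teacher update is a simple hill climb (swapped in for whatever update the patent uses), and the stopping rule is assumed to be a fixed round budget. All function names are illustrative.

```python
import random

def fit_line(xs, ys, ws):
    """Student model: weighted least-squares line y = a + b*x."""
    sw = sum(ws)
    mx = sum(w * x for w, x in zip(ws, xs)) / sw
    my = sum(w * y for w, y in zip(ws, ys)) / sw
    sxx = sum(w * (x - mx) ** 2 for w, x in zip(ws, xs))
    sxy = sum(w * (x - mx) * (y - my) for w, x, y in zip(ws, xs, ys))
    b = sxy / sxx
    return my - b * mx, b

def mse(model, xs, ys):
    a, b = model
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys)) / len(xs)

def teacher(theta, model, xs, ys):
    """Hypothetical teacher: maps each case's state feature (absolute
    residual under the current student) to a weight in (0, 1]."""
    a, b = model
    return [1.0 / (1.0 + theta * abs(y - (a + b * x))) for x, y in zip(xs, ys)]

def total_error(model, train, test):
    # Compare models on both training and testing data, as in the abstract.
    return mse(model, *train) + mse(model, *test)

def search_weights(train, test, rounds=100, step=0.5, seed=0):
    rng = random.Random(seed)
    theta = 0.0
    weights = teacher(theta, (0.0, 0.0), *train)  # all ones when theta == 0
    current = fit_line(*train, weights)
    for _ in range(rounds):  # assumed stopping rule: fixed round budget
        trial = max(0.0, theta + rng.uniform(-step, step))
        new_weights = teacher(trial, current, *train)
        candidate = fit_line(*train, new_weights)
        reward = total_error(current, train, test) - total_error(candidate, train, test)
        if reward > 0:  # assumed update: keep the teacher change, promote candidate
            theta, weights, current = trial, new_weights, candidate
    return weights, current
```

Because a candidate is promoted only on a positive reward, the combined error of the returned student never exceeds that of the unweighted starting model in this sketch.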

2.

Patent application
Data Partitioning with Quality Evaluation (Status: Active)

Publication number: US20210142213A1

Publication date: 2021-05-13

Application number: US16681920

Filing date: 2019-11-13

Applicant: International Business Machines Corporation

Inventors: Si Er Han, Steven George Barbee, Jing Xu, Ji Hui Yang, Xue Ying Zhang

IPC classification: G06N20/00, G06N5/04, G06F16/28, G06F16/2457

Abstract: Evaluating data partition quality is provided. A historical data set is partitioned into a specified number of partitions. A quality of each partition in the specified number of partitions is evaluated by measuring a distribution similarity between variables from each data subset in a respective partition and the historical data set. A highest-quality partition in the specified number of partitions is recommended to build a supervised machine learning model based on the highest-quality partition having a highest variable distribution similarity measure with the historical data set.
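
One way to read this abstract is as a search over candidate train/test splits, each scored by how closely its subsets follow the full data set. The sketch below makes several assumptions not stated in the abstract: each partition is a random two-way split, similarity is taken as one minus the two-sample Kolmogorov-Smirnov statistic, and a partition's quality is its worst similarity over all subsets and variables. The function names are hypothetical.

```python
import random
from bisect import bisect_right

def ks_statistic(sample, reference):
    """Largest gap between the two empirical CDFs (0 means identical)."""
    s, r = sorted(sample), sorted(reference)
    return max(
        abs(bisect_right(s, v) / len(s) - bisect_right(r, v) / len(r))
        for v in s + r
    )

def partition_quality(subsets, full, n_vars):
    """Worst-case similarity (1 - KS) over every subset and variable."""
    return min(
        1.0 - ks_statistic([row[j] for row in sub], [row[j] for row in full])
        for sub in subsets
        for j in range(n_vars)
    )

def best_partition(data, n_partitions=5, train_frac=0.7, seed=0):
    """Generate candidate splits and return (quality, (train, test)) for the best."""
    rng = random.Random(seed)
    n_vars = len(data[0])
    best = None
    for _ in range(n_partitions):
        rows = data[:]
        rng.shuffle(rows)
        cut = int(len(rows) * train_frac)
        subsets = (rows[:cut], rows[cut:])
        quality = partition_quality(subsets, data, n_vars)
        if best is None or quality > best[0]:
            best = (quality, subsets)
    return best
```

Taking the minimum rather than the mean penalizes a split in which even one variable is badly skewed in one subset; that is a design choice of this sketch, not something the abstract specifies.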

3.

Granted patent
Efficient execution of a decision tree (Status: Active)

Publication number: US12093838B2

Publication date: 2024-09-17

Application number: US17027688

Filing date: 2020-09-21

Applicant: International Business Machines Corporation

Inventors: Jing Xu, Si Er Han, Xue Ying Zhang, Steven George Barbee, Ji Hui Yang

IPC classification: G06N5/01, G06F17/18, G06F18/22, G06F18/2413, G06N7/01

CPC classification: G06N5/01, G06F17/18, G06F18/22, G06F18/2413, G06N7/01

Abstract: Embodiments of the present disclosure relate to a method, system, and computer program product for efficient execution of a decision tree. According to the method, respective target values of a plurality of attributes of a target entity are obtained. Representations of a plurality of leaf nodes of a decision tree are obtained. Each of the representations indicates respective statistic values of a plurality of attributes of historical entities and a statistic prediction result determined from historical prediction results output at a respective one of the plurality of leaf nodes for the historical entities. Distance measures between the target entity and the plurality of leaf nodes are determined based on the target values and the statistic values indicated by the representations of the plurality of leaf nodes. A target prediction result for the target entity is determined based on the distance measures and the statistic prediction results of the historical entities.
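
The prediction step in this abstract can be sketched in a few lines of Python. The sketch assumes that each leaf is summarized by the mean attribute values of its historical entities and their mean prediction, that the distance measure is a scaled Euclidean distance, and that the target prediction is an inverse-distance-weighted average of the leaf predictions. The abstract fixes none of these choices, and the leaf values and `scales` below are made up for illustration.

```python
import math

# Hypothetical leaf representations: per-leaf mean attribute values of the
# historical entities that reached the leaf, plus their mean prediction.
LEAVES = [
    {"stats": [25.0, 30000.0], "prediction": 0.10},
    {"stats": [45.0, 60000.0], "prediction": 0.40},
    {"stats": [60.0, 90000.0], "prediction": 0.75},
]

def leaf_distance(target, leaf, scales):
    """Scaled Euclidean distance between the target entity and a leaf summary."""
    return math.sqrt(
        sum(((t - s) / sc) ** 2 for t, s, sc in zip(target, leaf["stats"], scales))
    )

def predict(target, leaves, scales, eps=1e-9):
    """Inverse-distance-weighted average of the leaf predictions."""
    weights = [1.0 / (leaf_distance(target, leaf, scales) + eps) for leaf in leaves]
    weighted = sum(w * leaf["prediction"] for w, leaf in zip(weights, leaves))
    return weighted / sum(weights)
```

Under these assumptions no tree traversal is needed at prediction time: a call such as `predict([30.0, 40000.0], LEAVES, scales=[10.0, 10000.0])` touches only the leaf summaries.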

4.

Granted patent
Identifying optimal weights to improve prediction accuracy in machine learning techniques (Status: Active)

Publication number: US11443235B2

Publication date: 2022-09-13

Application number: US16684396

Filing date: 2019-11-14

Applicant: International Business Machines Corporation

Inventors: Jing Xu, Si Er Han, Steven George Barbee, Xue Ying Zhang, Ji Hui Yang

IPC classification: G06F15/16, G06F9/54, H04L29/06, G06N20/00, G06N5/02

Abstract: A computer-implemented method, system and computer program product for improving prediction accuracy in machine learning techniques. A teacher model is constructed, where the teacher model generates a weight for each data case. The current student model is then trained using training data and the weights generated by the teacher model. After training the current student model, the current student model generates state features, which are used by the teacher model to generate new weights. A candidate student model is then trained using training data and these new weights. A reward is generated by comparing the current student model with the candidate student model using training and testing data, which is used to update the teacher model if a stopping rule has not been satisfied. Upon a stopping rule being satisfied, the weights generated by the teacher model are deemed to be the “optimal” weights which are returned to the user.

5.

Patent application
DATA PARTITIONING WITH NEURAL NETWORK (Status: Active)

Publication number: US20220156572A1

Publication date: 2022-05-19

Application number: US16950017

Filing date: 2020-11-17

Applicant: International Business Machines Corporation

Inventors: Si Er Han, Jing Xu, Xue Ying Zhang, Ji Hui Yang, Steven George Barbee

IPC classification: G06N3/08, G06F16/27

Abstract: A computer-implemented method, system and computer program product for processing a data set is provided. In this method, an original data set including a plurality of data records is obtained. Each data record in the original data set has values of a first number of features. A representative data set having a plurality of representative data records is determined. Each representative data record has values of a second number of representatives. The second number of representatives are obtained by training an autoencoder neural network with values of the first number of features as inputs, and the second number is smaller than the first number. The plurality of representative data records is segmented into two or more clusters based on the values of the second number of representatives. The representative data records in the two or more clusters are partitioned to form a predefined number of representative data subsets.
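
The final step of this abstract, partitioning clustered records into a predefined number of subsets, might look like the sketch below. It assumes the autoencoder encoding and the clustering have already produced one cluster label per record, and it assumes a round-robin assignment that spreads each cluster evenly across the subsets. The abstract does not say how the partitioning is done, and `partition_by_cluster` is a hypothetical name.

```python
import random
from collections import defaultdict

def partition_by_cluster(cluster_labels, n_subsets, seed=0):
    """Return n_subsets lists of record indices, with each cluster's records
    dealt out round-robin so every subset mirrors the cluster mix."""
    rng = random.Random(seed)
    by_cluster = defaultdict(list)
    for idx, label in enumerate(cluster_labels):
        by_cluster[label].append(idx)

    subsets = [[] for _ in range(n_subsets)]
    offset = 0  # carried across clusters so subset sizes stay balanced
    for label in sorted(by_cluster):
        members = by_cluster[label]
        rng.shuffle(members)
        for i, idx in enumerate(members):
            subsets[(offset + i) % n_subsets].append(idx)
        offset += len(members)
    return subsets
```

For example, `partition_by_cluster([0, 0, 0, 0, 0, 0, 1, 1, 1], n_subsets=3)` gives three subsets that each hold two records from cluster 0 and one from cluster 1.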

6.

Patent application
EFFICIENT EXECUTION OF A DECISION TREE (Status: Active)

Publication number: US20220092437A1

Publication date: 2022-03-24

Application number: US17027688

Filing date: 2020-09-21

Applicant: International Business Machines Corporation

Inventors: Jing Xu, Si Er Han, Xue Ying Zhang, Steven George Barbee, Ji Hui Yang

IPC classification: G06N5/00, G06K9/62, G06N7/00, G06F17/18

Abstract: Embodiments of the present disclosure relate to a method, system, and computer program product for efficient execution of a decision tree. According to the method, respective target values of a plurality of attributes of a target entity are obtained. Representations of a plurality of leaf nodes of a decision tree are obtained. Each of the representations indicates respective statistic values of a plurality of attributes of historical entities and a statistic prediction result determined from historical prediction results output at a respective one of the plurality of leaf nodes for the historical entities. Distance measures between the target entity and the plurality of leaf nodes are determined based on the target values and the statistic values indicated by the representations of the plurality of leaf nodes. A target prediction result for the target entity is determined based on the distance measures and the statistic prediction results of the historical entities.

7.

Patent application
Feature Generation for Training Data Sets Based on Unlabeled Data (Status: Active)

Publication number: US20230073137A1

Publication date: 2023-03-09

Application number: US17447258

Filing date: 2021-09-09

Applicant: International Business Machines Corporation

Inventors: Jing Xu, Si Er Han, Xue Ying Zhang, Steven George Barbee, Ji Hui Yang

IPC classification: G06N20/00, G06K9/62

Abstract: A computer implemented method for machine learning model training. A number of processor units creates a cluster model comprising labeled samples and unlabeled samples. The number of processor units identifies cluster information for the labeled samples from the cluster model. The number of processor units adds a set of new features to a set of original features for the labeled samples using the cluster information to form an extended set of features for the labeled samples, wherein the labeled samples with the set of original features and the set of new features form a training data set for training a machine learning model.
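
A minimal sketch of the idea, under stated assumptions: the cluster model is taken to be a plain k-means fitted on labeled and unlabeled samples together, and the "cluster information" added to each labeled sample is assumed to be its cluster index and its distance to that cluster's centroid. The abstract names neither the clustering algorithm nor the new features, so both are illustrative, as are the function names.

```python
import math

def nearest(point, centroids):
    """Return (index of closest centroid, distance to it)."""
    dists = [math.dist(point, c) for c in centroids]
    best = min(range(len(centroids)), key=dists.__getitem__)
    return best, dists[best]

def kmeans(points, k, iters=20):
    """Plain k-means with a naive initialization (the first k points)."""
    centroids = [list(p) for p in points[:k]]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(p, centroids)[0]].append(p)
        for c, group in enumerate(groups):
            if group:  # keep the old centroid if a cluster empties out
                centroids[c] = [sum(col) / len(group) for col in zip(*group)]
    return centroids

def extend_features(labeled_x, unlabeled_x, k):
    """Cluster all samples, then append (cluster id, centroid distance)
    to the original features of each labeled sample."""
    centroids = kmeans(list(labeled_x) + list(unlabeled_x), k)
    return [list(x) + list(nearest(x, centroids)) for x in labeled_x]
```

The unlabeled samples influence only the centroids; the returned training rows are the labeled samples with two extra columns each.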

8.

Patent application
FRAUD SUSPECTS DETECTION AND VISUALIZATION (Status: Active)

Publication number: US20230083118A1

Publication date: 2023-03-16

Application number: US17476401

Filing date: 2021-09-15

Applicant: International Business Machines Corporation

Inventors: Steven George Barbee, Si Er Han, Jing Xu, Ji Hui Yang, Xue Ying Zhang

IPC classification: G06Q20/40, G06K9/62, G06F17/18

Abstract: An approach is provided in which the approach generates anomaly score variables using multiple unsupervised models based on a set of data records. The approach normalizes the anomaly score variables into multiple normalized variables, and constructs at least one interaction based on a first one of the normalized variables and a second one of the normalized variables. The first normalized variable corresponds to a first one of the anomaly score variables and the second normalized variable corresponds to a second one of the anomaly score variables. The approach detects a set of anomalies based on the at least one interaction and transmits the set of anomalies to a user.
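
The pipeline in this abstract (score, normalize, interact, detect) can be sketched as follows. The two unsupervised scorers (absolute z-score and nearest-neighbor distance), the min-max normalization, the product as the interaction, and the fixed threshold are all assumptions chosen for brevity; the abstract does not name specific models or an interaction form.

```python
import statistics

def zscore_scores(values):
    """Scorer 1: absolute z-score of each value."""
    mu, sd = statistics.fmean(values), statistics.pstdev(values)
    return [abs(v - mu) / sd for v in values]

def nn_scores(values):
    """Scorer 2: distance from each value to its nearest other value."""
    return [
        min(abs(v - u) for j, u in enumerate(values) if j != i)
        for i, v in enumerate(values)
    ]

def min_max(scores):
    """Normalize scores to the range [0, 1]."""
    lo, hi = min(scores), max(scores)
    return [(s - lo) / (hi - lo) for s in scores]

def detect(values, threshold=0.5):
    """Flag records whose interaction of the two normalized scores is high."""
    a = min_max(zscore_scores(values))
    b = min_max(nn_scores(values))
    interaction = [x * y for x, y in zip(a, b)]  # assumed form: product
    return [i for i, s in enumerate(interaction) if s > threshold]
```

With a product interaction, a record is flagged only when both scorers agree that it is unusual, which is one plausible reason to combine scores this way.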

9.

Granted patent
Uplift modeling (Status: Active)

Publication number: US11562400B1

Publication date: 2023-01-24

Application number: US17483328

Filing date: 2021-09-23

Applicant: International Business Machines Corporation

Inventors: Jing Xu, Si Er Han, Xue Ying Zhang, Steven George Barbee, Ji Hui Yang

IPC classification: G06Q30/02, G06K9/62, G06N20/00

Abstract: A method includes training a plurality of different types of machine learning models using a training dataset to produce a set of trained machine learning models and determining a lift of each trained machine learning model in the set of trained machine learning models using a validation dataset. The method also includes selecting a trained machine learning model from the set of trained machine learning models that has a highest lift of the set of trained machine learning models and predicting a likelihood that a person will perform an action by applying the selected trained machine learning model to data about the person.
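
The selection step can be sketched as below. The sketch assumes that "lift" means the response rate among the top-scored fraction of the validation set divided by the overall response rate; the patent may define lift differently, for example against a control group. Trained models are stood in for by hypothetical scoring functions, and the validation data is made up.

```python
def lift(model, rows, outcomes, top_frac=0.2):
    """Response rate in the top-scored fraction divided by the base rate."""
    ranked = sorted(
        zip((model(r) for r in rows), outcomes),
        key=lambda pair: pair[0],
        reverse=True,
    )
    top = ranked[: max(1, int(len(ranked) * top_frac))]
    top_rate = sum(outcome for _, outcome in top) / len(top)
    base_rate = sum(outcomes) / len(outcomes)
    return top_rate / base_rate

def select_model(models, rows, outcomes):
    """Return the name of the model with the highest validation lift."""
    return max(models, key=lambda name: lift(models[name], rows, outcomes))

# Hypothetical stand-ins for trained models of different types.
MODELS = {
    "by_visits": lambda person: person["visits"] / 10,
    "constant": lambda person: 0.5,
    "inverse": lambda person: 1 - person["visits"] / 10,
}

# Made-up validation data: people with 7 or more visits performed the action.
VALIDATION = [{"visits": v} for v in range(10)]
OUTCOMES = [1 if p["visits"] >= 7 else 0 for p in VALIDATION]

best_name = select_model(MODELS, VALIDATION, OUTCOMES)
likelihood = MODELS[best_name]({"visits": 8})  # score for a new person
```

The selected model is then applied to data about a new person, which corresponds to the last step of the abstract.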

10.

Patent application
IDENTIFYING OPTIMAL WEIGHTS TO IMPROVE PREDICTION ACCURACY IN MACHINE LEARNING TECHNIQUES (Status: Active)

Publication number: US20220292401A1

Publication date: 2022-09-15

Application number: US17827495

Filing date: 2022-05-27

Applicant: International Business Machines Corporation

Inventors: Jing Xu, Si Er Han, Steven George Barbee, Xue Ying Zhang, Ji Hui Yang

IPC classification: G06N20/00, G06N5/02

Abstract: A computer-implemented method, system and computer program product for improving prediction accuracy in machine learning techniques. A teacher model is constructed, where the teacher model generates a weight for each data case. The current student model is then trained using training data and the weights generated by the teacher model. After training the current student model, the current student model generates state features, which are used by the teacher model to generate new weights. A candidate student model is then trained using training data and these new weights. A reward is generated by comparing the current student model with the candidate student model using training and testing data, which is used to update the teacher model if a stopping rule has not been satisfied. Upon a stopping rule being satisfied, the weights generated by the teacher model are deemed to be the “optimal” weights which are returned to the user.
