Automatically determining whether an activation cluster contains poisonous data

    Publication No.: US11487963B2

    Publication Date: 2022-11-01

    Application No.: US16571321

    Application Date: 2019-09-16

    IPC Classification: G06K9/62 G06N3/04 G06N3/08

    Abstract: Embodiments relate to a system, program product, and method for automatically determining which activation data points in a neural model have been poisoned to erroneously indicate association with a particular label or labels. A neural network is trained using potentially poisoned training data. Each of the training data points is classified using the network to retain the activations of the last hidden layer, and those activations are segmented by the label of the corresponding training data. Clustering is applied to the retained activations of each segment, and a cluster assessment is conducted for each cluster associated with each label to distinguish clusters with potentially poisoned activations from clusters populated with legitimate activations. The assessment includes analyzing, for each cluster, the distance from the median of the activations therein to the medians of the activations in the labels.
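
    The per-label clustering and median-distance assessment described above can be sketched in a few lines. The sketch below is illustrative only: the k=2 clustering choice, the Euclidean distance, and the `assess_label_clusters` helper are assumptions rather than the patented reference implementation, and `activations` is presumed to hold the last-hidden-layer activations already collected per training point.

```python
import numpy as np
from sklearn.cluster import KMeans

def assess_label_clusters(activations, labels, n_clusters=2):
    """activations: (N, D) last-hidden-layer activations; labels: (N,) ints.
    Clusters each label's activations and compares each cluster's median
    against the per-label activation medians."""
    label_medians = {y: np.median(activations[labels == y], axis=0)
                     for y in np.unique(labels)}
    report = {}
    for y in np.unique(labels):
        segment = activations[labels == y]          # segment by label
        assign = KMeans(n_clusters=n_clusters, n_init=10).fit_predict(segment)
        for c in range(n_clusters):
            cluster_median = np.median(segment[assign == c], axis=0)
            # Distance from this cluster's median to every label's median; a
            # cluster whose median lies closer to another label's median than
            # to its own is flagged as potentially poisoned.
            dists = {y2: float(np.linalg.norm(cluster_median - m))
                     for y2, m in label_medians.items()}
            report[(int(y), c)] = {"distances": dists,
                                   "suspicious": min(dists, key=dists.get) != y}
    return report
```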

    Automatically Determining Whether an Activation Cluster Contains Poisonous Data

    Publication No.: US20210081708A1

    Publication Date: 2021-03-18

    Application No.: US16571321

    Application Date: 2019-09-16

    IPC Classification: G06K9/62 G06N3/08 G06N3/04

    Abstract: Embodiments relate to a system, program product, and method for automatically determining which activation data points in a neural model have been poisoned to erroneously indicate association with a particular label or labels. A neural network is trained using potentially poisoned training data. Each of the training data points is classified using the network to retain the activations of the last hidden layer, and those activations are segmented by the label of the corresponding training data. Clustering is applied to the retained activations of each segment, and a cluster assessment is conducted for each cluster associated with each label to distinguish clusters with potentially poisoned activations from clusters populated with legitimate activations. The assessment includes analyzing, for each cluster, the distance from the median of the activations therein to the medians of the activations in the labels.

    DETECTING AND MITIGATING POISON ATTACKS USING DATA PROVENANCE

    Publication No.: US20200019821A1

    Publication Date: 2020-01-16

    Application No.: US16031953

    Application Date: 2018-07-10

    IPC Classification: G06K9/62 G06F15/18 H04L29/06

    Abstract: Computer-implemented methods, program products, and systems for provenance-based defense against poison attacks are disclosed. In one approach, a method includes: receiving observations and corresponding provenance data from data sources; determining whether the observations are poisoned based on the corresponding provenance data; and removing the poisoned observation(s) from a final training dataset used to train a final prediction model. Another implementation involves provenance-based defense against poison attacks in a fully untrusted data environment. Untrusted data points are grouped according to provenance signature, and the groups are used to train learning algorithms and generate complete and filtered prediction models. The results of applying the prediction models to an evaluation dataset are compared, and poisoned data points are identified where the performance of the filtered prediction model exceeds the performance of the complete prediction model. Poisoned data points are removed from the set to generate a final prediction model.
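
    A minimal sketch of the fully untrusted variant follows: data points are grouped by provenance signature, a complete model is trained on everything, and per-group filtered models are trained with that group held out; a group whose removal improves evaluation performance is flagged as poisoned. The logistic-regression stand-in and the `(features, label, signature)` tuple format are assumptions for illustration.

```python
from collections import defaultdict
from sklearn.base import clone
from sklearn.linear_model import LogisticRegression

def find_poisoned_groups(points, X_eval, y_eval,
                         base=LogisticRegression(max_iter=1000)):
    """points: iterable of (features, label, provenance_signature) tuples."""
    groups = defaultdict(list)
    for x, y, sig in points:
        groups[sig].append((x, y))

    def fit_and_score(samples):
        X = [x for x, _ in samples]
        y = [lbl for _, lbl in samples]
        return clone(base).fit(X, y).score(X_eval, y_eval)

    # "Complete" model trained on all untrusted points.
    complete_score = fit_and_score([s for g in groups.values() for s in g])

    poisoned = []
    for sig in groups:
        held_out = [s for other, g in groups.items() if other != sig for s in g]
        # Removing a poisoned provenance group should *raise* performance.
        if fit_and_score(held_out) > complete_score:
            poisoned.append(sig)
    return poisoned
```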

    Detection of an adversarial backdoor attack on a trained model at inference time

    Publication No.: US11601468B2

    Publication Date: 2023-03-07

    Application No.: US16451110

    Application Date: 2019-06-25

    IPC Classification: H04L9/40 G06N5/04 G06N20/00

    Abstract: Systems, computer-implemented methods, and computer program products that can facilitate detection of an adversarial backdoor attack on a trained model at inference time are provided. According to an embodiment, a system can comprise a memory that stores computer executable components and a processor that executes the computer executable components stored in the memory. The computer executable components can comprise a log component that records predictions and corresponding activation values generated by a trained model based on inference requests. The computer executable components can further comprise an analysis component that employs a model at an inference time to detect a backdoor trigger request based on the predictions and the corresponding activation values. In some embodiments, the log component records the predictions and the corresponding activation values from one or more layers of the trained model.
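
    The two claimed components map naturally onto a logger and an analyzer, sketched below. This is a loose interpretation: the abstract does not specify the detection model, so an IsolationForest fit on activations from known-clean requests stands in for the model the analysis component employs.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

class ActivationLog:
    """Log component: records predictions and corresponding activation
    values generated by the trained model for each inference request."""
    def __init__(self):
        self.records = []
    def record(self, prediction, activations):
        self.records.append((prediction, np.asarray(activations)))

class BackdoorAnalyzer:
    """Analysis component: employs a model at inference time to detect a
    backdoor trigger request from logged activation values."""
    def __init__(self, clean_activations):
        self.detector = IsolationForest(random_state=0).fit(clean_activations)
    def is_trigger_request(self, activations):
        # -1 marks an outlier relative to the clean activation distribution,
        # treated here as a potential backdoor trigger request.
        x = np.asarray(activations).reshape(1, -1)
        return self.detector.predict(x)[0] == -1
```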

    Detecting backdoor attacks using exclusionary reclassification

    Publication No.: US11538236B2

    Publication Date: 2022-12-27

    Application No.: US16571318

    Application Date: 2019-09-16

    Abstract: Embodiments relate to a system, program product, and method for processing an untrusted data set to automatically determine which data points therein are poisonous. A neural network is trained using potentially poisoned training data. Each of the training data points is classified using the network to retain the activations of at least one hidden layer, and those activations are segmented by the label of the corresponding training data. Clustering is applied to the retained activations of each segment, and a clustering assessment is conducted to remove an identified cluster from the data set, form a new training set, and train a second neural model with the new training set. The removed cluster and corresponding data are applied to the trained second neural model to analyze and classify the data in the removed cluster as either legitimate or poisonous.
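
    The exclusionary-reclassification step can be sketched as follows, with a scikit-learn MLP standing in for the second neural model; the majority-agreement rule and the 0.5 threshold are illustrative assumptions rather than claimed values.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

def exclusionary_reclassification(X, y, suspect_idx, threshold=0.5):
    """X: (N, D) array, y: (N,) labels, suspect_idx: indices of the removed
    cluster. Retrains on the remaining data and reclassifies the removed
    points with the second model."""
    keep = np.setdiff1d(np.arange(len(X)), suspect_idx)
    second_model = MLPClassifier(max_iter=500).fit(X[keep], y[keep])
    preds = second_model.predict(X[suspect_idx])
    agreement = np.mean(preds == y[suspect_idx])
    # Poisoned points tend to lose their (backdoored) label once the trigger
    # data is excluded from training, so low agreement marks the cluster as
    # poisonous rather than legitimate.
    return "legitimate" if agreement >= threshold else "poisonous"
```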

    ADVERSARIAL INTERPOLATION BACKDOOR DETECTION

    Publication No.: US20220114259A1

    Publication Date: 2022-04-14

    Application No.: US17068853

    Application Date: 2020-10-13

    IPC Classification: G06F21/56 G06N20/00 G06N5/04

    Abstract: One or more computer processors determine a tolerance value and a norm value associated with an untrusted model and an adversarial training method. The one or more computer processors generate a plurality of interpolated adversarial images ranging between a pair of images utilizing the adversarial training method, wherein each image in the pair of images is from a different class. The one or more computer processors detect a backdoor associated with the untrusted model utilizing the generated plurality of interpolated adversarial images. The one or more computer processors harden the untrusted model by training the untrusted model with the generated plurality of interpolated adversarial images.
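
    One way to read the claim is sketched below: images are generated along a path between a pair drawn from two different classes, and the untrusted model is probed along that path. The linear interpolation schedule and the "dominant off-path class" heuristic are assumptions; the abstract specifies only that interpolated adversarial images between the pair are generated under a tolerance and norm value.

```python
import numpy as np

def interpolation_path(x_a, x_b, steps=16):
    """Images ranging between x_a (class A) and x_b (class B)."""
    alphas = np.linspace(0.0, 1.0, steps)
    return np.stack([(1.0 - a) * x_a + a * x_b for a in alphas])

def probe_for_backdoor(model_predict, x_a, x_b, class_a, class_b, steps=16):
    """model_predict: callable mapping a batch of images to class labels."""
    preds = model_predict(interpolation_path(x_a, x_b, steps))
    # On a clean model the path should flip between class_a and class_b; a
    # third class dominating intermediate images hints at a backdoor target.
    off_path = [p for p in preds if p not in (class_a, class_b)]
    return max(set(off_path), key=off_path.count) if off_path else None
```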

    DETECTING POISONING ATTACKS ON NEURAL NETWORKS BY ACTIVATION CLUSTERING

    Publication No.: US20200050945A1

    Publication Date: 2020-02-13

    Application No.: US16057706

    Application Date: 2018-08-07

    IPC Classification: G06N3/08 G06N3/04

    Abstract: One embodiment provides a method comprising receiving a training set comprising a plurality of data points, where a neural network is trained as a classifier based on the training set. The method further comprises, for each data point of the training set, classifying the data point with one of a plurality of classification labels using the trained neural network, and recording neuronal activations of a portion of the trained neural network in response to the data point. The method further comprises, for each classification label that a portion of the training set has been classified with, clustering a portion of all recorded neuronal activations that are in response to the portion of the training set, and detecting one or more poisonous data points in the portion of the training set based on the clustering.
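
    A compact sketch of the per-label clustering and detection step appears below. The FastICA dimensionality reduction, the k=2 clustering, and the relative-cluster-size rule are assumptions drawn from the activation-clustering literature, not values fixed by the claim text.

```python
import numpy as np
from sklearn.decomposition import FastICA
from sklearn.cluster import KMeans

def flag_poisonous_points(activations, labels, size_ratio=0.35):
    """activations: (N, D) recorded neuronal activations, one row per
    training point; labels: (N,) classification labels. Returns indices of
    data points flagged as poisonous."""
    flagged = []
    idx = np.arange(len(labels))
    for y in np.unique(labels):
        mask = labels == y
        reduced = FastICA(n_components=10, random_state=0).fit_transform(
            activations[mask])
        assign = KMeans(n_clusters=2, n_init=10).fit_predict(reduced)
        sizes = np.bincount(assign, minlength=2)
        small = int(np.argmin(sizes))
        # Poisoned data typically forms the markedly smaller cluster.
        if sizes[small] < size_ratio * sizes.sum():
            flagged.extend(idx[mask][assign == small].tolist())
    return flagged
```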