-
1.
Publication No.: US20190114416A1
Publication Date: 2019-04-18
Application No.: US15730949
Filing Date: 2017-10-12
Applicant: Cisco Technology, Inc.
Inventor: Tomas Komarek, Petr Somol
Abstract: In one embodiment, a device divides groups of tuples of traffic characteristics of encrypted network traffic into different pairs of the characteristics. Each of the pairs has a corresponding two dimensional (2-D) feature subspace. The device discretizes the 2-D feature subspaces, to form a plurality of bins in each feature subspace. The device assigns the pairs of the traffic characteristics in a particular group of tuples to the bins in the discretized 2-D feature subspaces. The device forms, for each group of tuples, a vector representation of the group of tuples based on the bins in the discretized 2-D feature subspaces to which the pairs of the traffic characteristics from the group are assigned. The vector representations of the groups of tuples are of a fixed dimension. The device uses the vector representations of the groups of tuples to train a machine learning-based traffic classifier.
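The pairwise binning described in this abstract can be illustrated with a minimal sketch. It is not the patented implementation: the function names `bin_index` and `group_to_vector`, the equal-width bins, the known value ranges, and the normalization by group size are all assumptions made for illustration.

```python
from itertools import combinations

def bin_index(value, low, high, n_bins):
    """Map a value to one of n_bins equal-width bins over [low, high]."""
    if high <= low:
        return 0
    pos = int((value - low) / (high - low) * n_bins)
    return min(max(pos, 0), n_bins - 1)  # clamp out-of-range values

def group_to_vector(group, ranges, n_bins=4):
    """Turn a variable-size group of tuples into one fixed-length vector.

    group  : list of tuples, each holding the same d traffic characteristics
    ranges : list of (low, high) per characteristic (assumed known up front)
    """
    d = len(ranges)
    vector = []
    for i, j in combinations(range(d), 2):   # each pair spans one 2-D subspace
        counts = [0] * (n_bins * n_bins)     # the discretized subspace
        for t in group:
            bi = bin_index(t[i], *ranges[i], n_bins)
            bj = bin_index(t[j], *ranges[j], n_bins)
            counts[bi * n_bins + bj] += 1
        total = len(group) or 1
        # Assumption: normalize counts so groups of different size are comparable.
        vector.extend(c / total for c in counts)
    return vector

# Hypothetical characteristics: (bytes, packets, duration in seconds).
ranges = [(0, 1500), (0, 10), (0, 60)]
small_group = [(100, 1, 5), (1400, 9, 50)]
large_group = small_group * 10 + [(700, 5, 30)]
vec_small = group_to_vector(small_group, ranges)
vec_large = group_to_vector(large_group, ranges)
```

With 3 characteristics there are 3 pairs, so both vectors have 3 x 16 entries regardless of how many tuples each group contains, which is the fixed-dimension property the abstract refers to.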
-
2.
Publication No.: US11374944B2
Publication Date: 2022-06-28
Application No.: US16224963
Filing Date: 2018-12-19
Applicant: Cisco Technology, Inc.
Inventor: Tomas Komarek, Petr Somol
IPC: H04L29/06, H04L9/40, H04L41/142, G06N20/00, G06K9/62
Abstract: In one embodiment, a network security service forms, for each of a plurality of malware classes, a feature vector descriptor for the malware class. The service uses the feature vector descriptors for the malware classes and a symmetric mapping function to generate a training dataset having both positively and negatively labeled feature vectors. The service trains, using the training dataset, an instant threat detector to determine whether telemetry data for a particular traffic flow is within a threshold of similarity to a feature vector descriptor for a new malware class that was not part of the plurality of malware classes.
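One way to read the dataset-generation step is sketched below. The abstract does not specify the descriptor or the mapping function, so everything here is an assumption: the descriptor is taken to be the per-class mean, the symmetric mapping is an element-wise absolute difference, and a pair is labeled positive when a sample is compared against its own class descriptor. The names `class_descriptor`, `symmetric_map`, and `build_pair_dataset` are hypothetical.

```python
def class_descriptor(vectors):
    """Mean of a class's feature vectors (one assumed way to form a descriptor)."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def symmetric_map(a, b):
    """Element-wise absolute difference; swapping a and b gives the same result."""
    return [abs(x - y) for x, y in zip(a, b)]

def build_pair_dataset(samples_by_class):
    """Pair every sample with every class descriptor and label the pair."""
    descriptors = {c: class_descriptor(v) for c, v in samples_by_class.items()}
    dataset = []
    for true_class, vectors in samples_by_class.items():
        for v in vectors:
            for c, desc in descriptors.items():
                label = 1 if c == true_class else 0   # 1 = same class, 0 = different
                dataset.append((symmetric_map(v, desc), label))
    return dataset

# Hypothetical two-feature samples for two known malware classes.
samples = {
    "class_a": [[0.0, 0.0], [0.2, 0.0]],
    "class_b": [[1.0, 1.0], [1.0, 0.8]],
}
pairs = build_pair_dataset(samples)
```

Because a classifier trained on such pairs sees only how far a sample lies from a descriptor, not which class the descriptor belongs to, it could in principle be applied to a descriptor for a class that was absent during training. That is an interpretation of the abstract, not a statement of the claimed method.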
-
3.
Publication No.: US20190123982A1
Publication Date: 2019-04-25
Application No.: US15790402
Filing Date: 2017-10-23
Applicant: Cisco Technology, Inc.
Inventor: Tomas Komarek, Martin Vejman, Petr Somol
Abstract: In one embodiment, a device groups feature vectors representing network traffic flows into bags. The device forms a bag representation of a particular one of the bags by aggregating the feature vectors in the particular bag. The device extends one or more feature vectors in the particular bag with the bag representation. The extended one or more feature vectors are positive examples of a classification label for the network traffic. The device trains a network traffic classifier using training data that comprises the one or more feature vectors extended with the bag representation.
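The bag-extension step can be sketched as follows. The aggregation function is not named in the abstract, so the per-feature mean and maximum used here are assumptions, as are the function names `bag_representation` and `extend_with_bag`.

```python
def bag_representation(bag):
    """Aggregate a bag of equal-length feature vectors.

    Assumption: the aggregate is the per-feature mean followed by the
    per-feature maximum; other aggregations would work the same way.
    """
    columns = list(zip(*bag))
    means = [sum(c) / len(c) for c in columns]
    maxes = [max(c) for c in columns]
    return means + maxes

def extend_with_bag(bag, positive_indices):
    """Append the bag representation to each selected (positive) feature vector."""
    rep = bag_representation(bag)
    return [list(bag[i]) + rep for i in positive_indices]

# Hypothetical bag of two flows with two features each; flow 1 is a known positive.
bag = [[1.0, 0.0], [3.0, 4.0]]
extended = extend_with_bag(bag, positive_indices=[1])
```

Each extended vector carries both the flow's own features and the context of the bag it came from, so a flow-level classifier can use bag-level information without every flow in the bag needing a label.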
-
4.
Publication No.: US20230376836A1
Publication Date: 2023-11-23
Application No.: US17749740
Filing Date: 2022-05-20
Applicant: Cisco Technology, Inc.
Inventor: Tomas Komarek, Stepan Dvorak, Jan Brabec
CPC classification number: G06N20/00, H04L63/1441
Abstract: Techniques and architecture are described for converting tree structured data such as, for example, JavaScript Object Notation (JSON) data, into multiple feature vectors to train multiple instance learning (MIL) models for providing cybersecurity in networks. In particular, a data set is provided, wherein the data set comprises a sample configured as a hierarchal tree. The sample is converted into a set of path and value pairs, e.g., flattened into a set of path and value pairs, where the path is a sequence of field names and array indices encoding a position of a value. Each path and value pair of the set of path and value pairs is converted into a respective feature vector to form a set of feature vectors. The set of feature vectors is used to train a multiple instance learning (MIL) model, wherein each feature vector has a same, fixed length.
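The flattening and vectorization steps can be sketched as below. The flattening follows the abstract's description of paths as sequences of field names and array indices. The encoding of each pair into a fixed-length vector is not specified in the abstract, so the hashing scheme in `pair_to_vector` (and the helper `stable_hash`) is a hypothetical stand-in.

```python
import hashlib
import json

def flatten(node, path=()):
    """Yield (path, value) pairs; a path is a tuple of field names and array indices."""
    if isinstance(node, dict):
        for key, child in node.items():
            yield from flatten(child, path + (key,))
    elif isinstance(node, list):
        for index, child in enumerate(node):
            yield from flatten(child, path + (index,))
    else:
        yield path, node   # leaf value

def stable_hash(text, size):
    """Deterministic bucket index in [0, size) derived from SHA-256."""
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % size

def pair_to_vector(path, value, size=16):
    """Hash path and value into one fixed-length vector (assumed encoding)."""
    vec = [0.0] * size
    half = size // 2
    path_text = "/".join(str(p) for p in path)
    vec[stable_hash(path_text, half)] = 1.0            # first half encodes the path
    vec[half + stable_hash(repr(value), half)] = 1.0   # second half encodes the value
    return vec

# Hypothetical JSON sample.
sample = json.loads('{"host": "a.example", "ports": [80, 443], "tls": {"v": 1.3}}')
pairs = list(flatten(sample))
vectors = [pair_to_vector(p, v) for p, v in pairs]
```

The sample yields four path and value pairs, such as `("ports", 1)` with value `443`, and every resulting vector has the same length. The set of vectors for one sample then forms a bag that a multiple instance learning model could consume.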
-
5.
Publication No.: US11799904B2
Publication Date: 2023-10-24
Application No.: US17117942
Filing Date: 2020-12-10
Applicant: Cisco Technology, Inc.
Inventor: Tomas Komarek, Jan Brabec, Cenek Skarda
IPC: H04L9/40
CPC classification number: H04L63/1466, H04L63/1416, H04L63/1425, H04L63/1433, H04L63/20
Abstract: Inverse imbalance subspace searching techniques are used to detect potential malware among samples of network communication data. A large number of samples of network communication data, such as proxy log data and/or network flows, are received and analyzed by a malware detection system. A number of the samples are associated with known malware, while other unlabeled samples are either benign or may be associated with unknown malware. An inverse imbalance subspace search may be performed, in which the sample sets are divided into subsets based on random feature thresholds, and each subset is evaluated based on the ratio of known malware samples to unlabeled samples. Unlabeled samples within subsets having high malware sample ratios may be identified, aggregated, and processed as potential malware.
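The search loop described in this abstract can be sketched roughly as follows. This is an illustration under stated assumptions, not the claimed method: the function name `subspace_search`, the parameters `n_rounds`, `depth`, and `min_ratio`, and the choice of uniform random thresholds over the observed feature range are all hypothetical.

```python
import random

def subspace_search(malware, unlabeled, n_rounds=200, depth=2, min_ratio=2.0, seed=0):
    """Return indices of unlabeled samples found in malware-heavy subsets."""
    rng = random.Random(seed)
    everything = list(malware) + list(unlabeled)
    n_features = len(everything[0])
    lows = [min(s[f] for s in everything) for f in range(n_features)]
    highs = [max(s[f] for s in everything) for f in range(n_features)]
    flagged = set()
    for _ in range(n_rounds):
        # One subset = `depth` random conditions of the form (feature, threshold, side).
        conds = []
        for _ in range(depth):
            f = rng.randrange(n_features)
            conds.append((f, rng.uniform(lows[f], highs[f]), rng.random() < 0.5))

        def inside(sample):
            return all((sample[f] >= t) == side for f, t, side in conds)

        n_malware = sum(1 for s in malware if inside(s))
        hits = [i for i, s in enumerate(unlabeled) if inside(s)]
        # Keep the subset only if known malware clearly outnumbers unlabeled samples.
        if hits and n_malware / len(hits) >= min_ratio:
            flagged.update(hits)
    return sorted(flagged)

# Hypothetical two-feature data: known malware clusters in one corner,
# most unlabeled samples sit elsewhere, and one unlabeled sample sits in the malware corner.
known_malware = [[0.9, 0.9], [0.85, 0.95], [0.95, 0.85], [0.9, 0.85], [0.85, 0.9], [0.92, 0.88]]
unlabeled_samples = [[0.1, 0.1]] * 10 + [[0.88, 0.92]]
candidates = subspace_search(known_malware, unlabeled_samples)
```

In this toy data the ten identical unlabeled samples can only appear in a subset together, where they always outnumber the six malware samples, so they are never flagged. The last unlabeled sample shares subsets with malware alone and is the one expected to be returned as a candidate.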
-
6.
Publication No.: US20220191244A1
Publication Date: 2022-06-16
Application No.: US17117942
Filing Date: 2020-12-10
Applicant: Cisco Technology, Inc.
Inventor: Tomas Komarek, Jan Brabec, Cenek Skarda
IPC: H04L29/06
Abstract: Inverse imbalance subspace searching techniques are used to detect potential malware among samples of network communication data. A large number of samples of network communication data, such as proxy log data and/or network flows, are received and analyzed by a malware detection system. A number of the samples are associated with known malware, while other unlabeled samples are either benign or may be associated with unknown malware. An inverse imbalance subspace search may be performed, in which the sample sets are divided into subsets based on random feature thresholds, and each subset is evaluated based on the ratio of known malware samples to unlabeled samples. Unlabeled samples within subsets having high malware sample ratios may be identified, aggregated, and processed as potential malware.
-
7.
Publication No.: US11271833B2
Publication Date: 2022-03-08
Application No.: US15790402
Filing Date: 2017-10-23
Applicant: Cisco Technology, Inc.
Inventor: Tomas Komarek, Martin Vejman, Petr Somol
IPC: H04L12/26, H04L43/062, H04L29/06, G06N20/00, H04L41/16, H04L43/026
Abstract: In one embodiment, a device groups feature vectors representing network traffic flows into bags. The device forms a bag representation of a particular one of the bags by aggregating the feature vectors in the particular bag. The device extends one or more feature vectors in the particular bag with the bag representation. The extended one or more feature vectors are positive examples of a classification label for the network traffic. The device trains a network traffic classifier using training data that comprises the one or more feature vectors extended with the bag representation.
-