MULTIPLE PAIRWISE FEATURE HISTOGRAMS FOR REPRESENTING NETWORK TRAFFIC

    公开(公告)号:US20190114416A1

    公开(公告)日:2019-04-18

    申请号:US15730949

    申请日:2017-10-12

    Abstract: In one embodiment, a device divides groups of tuples of traffic characteristics of encrypted network traffic into different pairs of the characteristics. Each of the pairs has a corresponding two dimensional (2-D) feature subspace. The device discretizes the 2-D feature subspaces, to form a plurality of bins in each feature subspace. The device assigns the pairs of the traffic characteristics in a particular group of tuples to the bins in the discretized 2-D feature subspaces. The device forms, for each group of tuples, a vector representation of the group of tuples based on the bins in the discretized 2-D feature subspaces to which the pairs of the traffic characteristics from the group are assigned. The vector representations of the groups of tuples are of a fixed dimension. The device uses the vector representations of the groups of tuples to train a machine learning-based traffic classifier.

    Instant network threat detection system

    公开(公告)号:US11374944B2

    公开(公告)日:2022-06-28

    申请号:US16224963

    申请日:2018-12-19

    Abstract: In one embodiment, a network security service forms, for each of a plurality of malware classes, a feature vector descriptor for the malware class. The service uses the feature vector descriptors for the malware classes and a symmetric mapping function to generate a training dataset having both positively and negatively labeled feature vectors. The service trains, using the training dataset, an instant threat detector to determine whether telemetry data for a particular traffic flow is within a threshold of similarity to a feature vector descriptor for a new malware class that was not part of the plurality of malware classes.

    MULTIPLE INSTANCE LEARNING MODELS FOR CYBERSECURITY USING JAVASCRIPT OBJECT NOTATION (JSON) TRAINING DATA

    公开(公告)号:US20230376836A1

    公开(公告)日:2023-11-23

    申请号:US17749740

    申请日:2022-05-20

    CPC classification number: G06N20/00 H04L63/1441

    Abstract: Techniques and architecture are described for converting tree structured data such as, for example, JavaScript Object Notation (JSON) data, into multiple feature vectors to train multiple instance learning (MIL) models for providing cybersecurity in networks. In particular, a data set is provided, wherein the data set comprises a sample configured as a hierarchal tree. The sample is converted into a set of path and value pairs, e.g., flattened into a set of path and value pairs, where the path is a sequence of field names and array indices encoding a position of a value. Each path and value pair of the set of path and value pairs is converted into a respective feature vector to form a set of feature vectors. The set of feature vectors is used to train a multiple instance learning (MIL) model, wherein each feature vector has a same, fixed length.

    Malware detection using inverse imbalance subspace searching

    公开(公告)号:US11799904B2

    公开(公告)日:2023-10-24

    申请号:US17117942

    申请日:2020-12-10

    Abstract: Inverse imbalance subspace searching techniques are used to detect potential malware among samples of network communication data. A large number of samples of network communication data, such as proxy log data and/or network flows, are received and analyzed by a malware detection system. A number of the samples are associated with known malware, while other unlabeled samples are either benign or may be associated with unknown malware. An inverse imbalance subspace search may be performed, in which the sample sets are divided into subsets based on random feature thresholds, and each subset is evaluated based on the ratio of known malware samples to unlabeled samples. Unlabeled samples within subsets having high malware sample ratios may be identified, aggregated, and processed as potential malware.

    MALWARE DETECTION USING INVERSE IMBALANCE SUBSPACE SEARCHING

    公开(公告)号:US20220191244A1

    公开(公告)日:2022-06-16

    申请号:US17117942

    申请日:2020-12-10

    Abstract: Inverse imbalance subspace searching techniques are used to detect potential malware among samples of network communication data. A large number of samples of network communication data, such as proxy log data and/or network flows, are received and analyzed by a malware detection system. A number of the samples are associated with known malware, while other unlabeled samples are either benign or may be associated with unknown malware. An inverse imbalance subspace search may be performed, in which the sample sets are divided into subsets based on random feature thresholds, and each subset is evaluated based on the ratio of known malware samples to unlabeled samples. Unlabeled samples within subsets having high malware sample ratios may be identified, aggregated, and processed as potential malware.

Patent Agency Ranking