Abstract:
In one embodiment, data flows are received in a network, and information relating to the received data flows is provided to a machine learning attack detector. Then, in response to receiving an attack detection indication from the machine learning attack detector, a traffic segregation procedure is performed including: computing an anomaly score for each of the received data flows based on a degree of divergence from an expected traffic model, determining a subset of the received data flows that have an anomaly score that is lower than or equal to an anomaly threshold value, and providing information relating to the subset of the received data flows to the machine learning attack detector.
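The segregation step can be illustrated with a minimal sketch. The abstract does not say how divergence from the expected traffic model is measured, so the z-score on a single rate feature used here is an assumption, as are the names `Flow`, `anomaly_score`, and `segregate`.

```python
from dataclasses import dataclass


@dataclass
class Flow:
    flow_id: str
    bytes_per_sec: float


def anomaly_score(flow, expected_mean, expected_std):
    # Assumed divergence measure: absolute z-score against the expected model.
    return abs(flow.bytes_per_sec - expected_mean) / expected_std


def segregate(flows, expected_mean, expected_std, threshold):
    # Keep only flows whose anomaly score is lower than or equal to the threshold.
    return [
        f for f in flows
        if anomaly_score(f, expected_mean, expected_std) <= threshold
    ]


flows = [Flow("a", 100.0), Flow("b", 120.0), Flow("c", 900.0)]
subset = segregate(flows, expected_mean=110.0, expected_std=20.0, threshold=2.0)
subset_ids = [f.flow_id for f in subset]
```

Under these assumed values, flow "c" diverges strongly from the expected model and is excluded, so only information about flows "a" and "b" would be passed back to the attack detector.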
Abstract:
In one embodiment, a device receives a classifier tracking request from a coordinator device that specifies a classifier verification time period. During the classifier verification time period, the device classifies a set of network traffic that includes traffic observed by the device and attack traffic specified by the coordinator device. The device generates classification results based on the classified set of network traffic and provides the classification results to the coordinator device.
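A rough sketch of the device-side verification follows. It assumes, purely for illustration, that the locally observed traffic is benign and that the classifier is a callable returning True for an attack; the result fields and the name `run_verification` are hypothetical, and the handling of the verification time period is omitted.

```python
def run_verification(classifier, observed, injected_attacks):
    # Label each sample: observed traffic is assumed benign (an assumption),
    # coordinator-specified attack traffic is labeled as an attack.
    samples = [(x, False) for x in observed]
    samples += [(x, True) for x in injected_attacks]

    detected = sum(1 for x, is_attack in samples if is_attack and classifier(x))
    false_alarms = sum(
        1 for x, is_attack in samples if not is_attack and classifier(x)
    )
    # Classification results to report back to the coordinator device.
    return {
        "injected": len(injected_attacks),
        "detected": detected,
        "false_alarms": false_alarms,
    }


results = run_verification(
    classifier=lambda rate: rate > 500,  # toy threshold classifier
    observed=[100, 200, 600],
    injected_attacks=[800, 450],
)
```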
Abstract:
In one embodiment, a device in a network generates an expected traffic model based on a training set of data used to train a machine learning attack detector. The device provides the expected traffic model to one or more nodes in the network. The device receives an unexpected behavior notification from a particular node of the one or more nodes. The particular node generates the unexpected behavior notification based on a comparison between the expected traffic model and an observed traffic behavior by the node. The particular node also prevents the machine learning attack detector from analyzing the observed traffic behavior. The device updates the machine learning attack detector to account for the observed traffic behavior.
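The node-side behavior described above might look like the following sketch. Representing the expected traffic model as a simple value range is an assumption, and `handle_observation`, `detector`, and `notify` are hypothetical names.

```python
def handle_observation(value, expected_range, detector, notify):
    low, high = expected_range
    if low <= value <= high:
        # Behavior matches the expected traffic model: analyze it normally.
        return detector(value)
    # Unexpected behavior: report it and keep the detector from analyzing it.
    notify({"observed": value, "expected_range": expected_range})
    return None


notifications = []
in_model = handle_observation(50, (0, 100), lambda v: v > 80, notifications.append)
out_of_model = handle_observation(250, (0, 100), lambda v: v > 80, notifications.append)
```

Here the second observation falls outside the range covered by the training set, so the detector is bypassed and a notification is queued for the device that maintains the detector.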
Abstract:
In one embodiment, a network node receives a voting request from a neighboring node that indicates a potential network attack. The network node determines a set of feature values to be used as input to a classifier based on the voting request. The network node also determines whether the potential network attack is present by using the set of feature values as input to the classifier. The network node further sends a vote to the neighboring node that indicates whether the potential network attack was determined to be present.
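As an illustration of the voting exchange, the sketch below assumes the voting request names the features to use and that the classifier is a callable over a list of feature values. The message fields (`request_id`, `feature_names`, `attack_present`) are hypothetical.

```python
def handle_voting_request(request, local_observations, classifier):
    # Determine the feature values to use, based on the voting request.
    feature_values = [local_observations[name] for name in request["feature_names"]]
    # Classify, then build the vote to send to the neighboring node.
    attack_present = bool(classifier(feature_values))
    return {"request_id": request["request_id"], "attack_present": attack_present}


request = {"request_id": 7, "feature_names": ["packet_loss", "retransmissions"]}
observations = {"packet_loss": 0.4, "retransmissions": 35, "link_quality": 0.9}
vote = handle_voting_request(
    request, observations, lambda f: f[0] > 0.3 and f[1] > 20
)
```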
Abstract:
In one embodiment, attack observations by a first node regarding an attack detected by the first node are provided to a user interface device. Input is received from the user interface device that confirms that a particular attack observation by the first node indicates that the attack was detected correctly by the first node. Attack observations by one or more other nodes are provided to the user interface device. Input is received from the user interface device that confirms whether the attack observations by the first node and the attack observations by the one or more other nodes are both related to the attack. The one or more other nodes are identified as potential voters for the first node in a voting-based attack detection mechanism based on the attack observations from the first node and the one or more other nodes being related.
Abstract:
In one embodiment, voting optimization requests that identify a validation data set are sent to a plurality of network nodes. Voting optimization data, generated by executing classifiers using the validation data set, is received from the plurality of network nodes. A set of one or more voting classifiers is then selected from among the classifiers based on the voting optimization data. One or more network nodes that host a voting classifier in the set of one or more selected voting classifiers are then notified of the selection.
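The selection step could be sketched as below. The abstract does not specify the selection criterion, so ranking nodes by validation accuracy and keeping the top `k` is an assumption, as is the name `select_voters`.

```python
def select_voters(optimization_data, k):
    # optimization_data maps a node ID to the accuracy its classifier
    # achieved on the shared validation data set (assumed metric).
    ranked = sorted(optimization_data.items(), key=lambda item: item[1], reverse=True)
    return [node for node, _ in ranked[:k]]


optimization_data = {"n1": 0.91, "n2": 0.85, "n3": 0.97}
selected = select_voters(optimization_data, k=2)
```

The nodes in `selected` are the ones that would then be notified of the selection.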
Abstract:
In one embodiment, a first network device receives a notification that the first network device has been selected to validate a machine learning model for a second network device. The first network device receives model parameters for the machine learning model that were generated by the second network device using training data on the second network device. The model parameters are used with local data on the first network device to determine performance metrics for the model parameters. The performance metrics are then provided to the second network device.
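A minimal sketch of the validating device's role follows. It assumes the model is a linear classifier described by `weights` and `bias` and that accuracy is the performance metric. Both are illustrative choices that the abstract does not state.

```python
def validate(params, local_data):
    weights, bias = params["weights"], params["bias"]
    correct = 0
    for features, label in local_data:
        # Apply the received model parameters to a local sample.
        score = sum(w * x for w, x in zip(weights, features)) + bias
        if (score > 0) == label:
            correct += 1
    # Performance metrics to return to the second network device.
    return {"accuracy": correct / len(local_data), "samples": len(local_data)}


params = {"weights": [1.0, -1.0], "bias": 0.0}
local_data = [
    ([2.0, 1.0], True),
    ([1.0, 3.0], False),
    ([0.0, 1.0], True),
    ([3.0, 0.0], True),
]
metrics = validate(params, local_data)
```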
Abstract:
In one embodiment, a network assurance service that monitors a network detects a network issue in the network using a machine learning model and based on telemetry data captured in the network. The service assigns the detected network issue to an issue cluster by applying clustering to the detected network issue and to a plurality of previously detected network issues. The service selects a set of one or more actions for the detected network issue from among a plurality of actions associated with the previously detected network issues in the issue cluster. The service obtains context data for the detected network issue. The service provides, to a user interface, an indication of the detected network issue, the obtained context data for the detected network issue, and the selected set of one or more actions.
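The cluster assignment and action selection can be sketched as follows. Nearest-centroid assignment stands in for the unspecified clustering method, and choosing the most frequent past actions in the cluster is an assumed selection rule. The names and sample data are hypothetical.

```python
import math
from collections import Counter


def assign_cluster(issue_vector, centroids):
    # Assumed simplification: assign the issue to the nearest cluster centroid.
    return min(centroids, key=lambda c: math.dist(issue_vector, centroids[c]))


def select_actions(cluster_id, past_issues, top_n=1):
    # past_issues holds (cluster_id, action) pairs from earlier detected issues.
    counts = Counter(action for cid, action in past_issues if cid == cluster_id)
    return [action for action, _ in counts.most_common(top_n)]


centroids = {"wifi": (0.0, 0.0), "dhcp": (10.0, 10.0)}
past_issues = [
    ("dhcp", "restart dhcp server"),
    ("dhcp", "restart dhcp server"),
    ("dhcp", "expand address pool"),
    ("wifi", "change channel"),
]
cluster = assign_cluster((9.0, 8.0), centroids)
actions = select_actions(cluster, past_issues)
```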
Abstract:
In one embodiment, a service receives telemetry data collected from a plurality of different networks. The service combines the telemetry data into a synthetic input trace. The service inputs the synthetic input trace into a plurality of machine learning models to generate a plurality of predicted key performance indicators (KPIs), each of the models having been trained to assess telemetry data from an associated network in the plurality of different networks and predict a KPI for that network. The service compares the plurality of predicted KPIs to identify one of the plurality of different networks as exhibiting an abnormal behavior.
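One way to picture the comparison is the sketch below. Concatenation as the way to combine telemetry, toy per-network models, and "largest deviation from the median prediction" as the abnormality test are all assumptions made for illustration.

```python
from statistics import mean, median


def build_synthetic_trace(telemetry_by_network):
    # Assumed combination rule: concatenate samples from every network.
    return [x for samples in telemetry_by_network.values() for x in samples]


def find_abnormal_network(models, synthetic_trace):
    # Each network's model predicts a KPI from the same synthetic input trace.
    predictions = {net: model(synthetic_trace) for net, model in models.items()}
    center = median(predictions.values())
    # Flag the network whose prediction deviates most from the others.
    abnormal = max(predictions, key=lambda net: abs(predictions[net] - center))
    return abnormal, predictions


telemetry = {"a": [1.0, 2.0], "b": [3.0], "c": [2.0, 4.0]}
models = {
    "a": lambda trace: 1.0 * mean(trace),  # toy stand-ins for trained models
    "b": lambda trace: 1.1 * mean(trace),
    "c": lambda trace: 3.0 * mean(trace),
}
trace = build_synthetic_trace(telemetry)
abnormal, predictions = find_abnormal_network(models, trace)
```

Because every model sees the same input, differences in the predicted KPIs reflect differences in the networks the models were trained on rather than differences in the traffic.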
Abstract:
In one embodiment, a network assurance service maintains a data lake of network telemetry data obtained by the service from any number of computer networks. The service generates a machine learning model for on-premise execution in a particular computer network to detect network issues in the particular network. To do so, the service repeatedly selects a candidate set of model settings based in part on the data lake of network telemetry data, trains a machine learning model using network telemetry data from the data lake that matches the candidate set of model settings, and tests performance of the trained model using an emulator that emulates network issues in the particular network. The service further deploys the generated machine learning model to the particular computer network for on-premise execution.
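The repeated select, train, and test loop might be sketched as follows. The callables `matches`, `train`, and `emulator_score` are hypothetical stand-ins for the service's components, and iterating over a fixed candidate list is a simplification of however candidates are actually chosen.

```python
def generate_model(data_lake, candidate_settings, matches, train, emulator_score):
    best_model, best_score = None, float("-inf")
    for settings in candidate_settings:
        # Use only data lake telemetry that matches the candidate settings.
        subset = [rec for rec in data_lake if matches(rec, settings)]
        if not subset:
            continue
        model = train(subset, settings)
        # Test the trained model against emulated issues for the target network.
        score = emulator_score(model)
        if score > best_score:
            best_model, best_score = model, score
    return best_model, best_score


data_lake = [
    {"size": "small", "latency": 10.0},
    {"size": "small", "latency": 14.0},
    {"size": "large", "latency": 40.0},
]
candidates = [{"size": "small", "factor": 1.5}, {"size": "large", "factor": 1.5}]
emulated = [(15.0, False), (25.0, True), (30.0, True)]  # (latency, is_issue)

model, score = generate_model(
    data_lake,
    candidates,
    matches=lambda rec, s: rec["size"] == s["size"],
    # Toy "model": a latency threshold derived from the matching telemetry.
    train=lambda subset, s: s["factor"] * sum(r["latency"] for r in subset) / len(subset),
    emulator_score=lambda thr: sum((lat > thr) == issue for lat, issue in emulated) / len(emulated),
)
```

In this toy run, the threshold trained on the "small" telemetry classifies all emulated issues correctly, so it is the model that would be deployed for on-premise execution.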