ML Based Network Intrusion Detection for SPIRE

Although Spire is intrusion tolerant (i.e., potential attackers cannot execute malicious operations or significantly affect the timeliness of the system), it does not provide alerts or feedback about potential compromises. A system that provides this situational awareness could be invaluable to system operators, allowing them to diagnose and fix problems. Based on previous field tests of Spire as well as our own research, we decided that machine learning would be the best way to approach this problem, as it allows for the detection of novel attacks. The following is a general overview of our system:

  • Collection of Spire's normal network traffic using SPAN (port mirroring) on the external dissemination network switch.
  • Data pipeline that parses packets and then either stores them for training or runs predictions on them (see diagram).
  • Traffic pattern based prediction using packet counts per minute. This is better at detecting larger-scale attacks such as denial of service.
  • Packet analysis based prediction that clusters similar individual packets. This can detect packets that are unusual with respect to normal system operation.
  • Majority voting between different algorithms to reduce the false positive rate.
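The parse-then-store-or-predict step of the pipeline can be sketched as follows. This is a minimal illustration, not the project's actual code: it assumes the captured bytes start at the IPv4 header (a SPAN capture would normally also carry an Ethernet header), and the names `parse_ipv4_udp`, `handle_packet`, `training_store`, and `predictor` are hypothetical.

```python
import socket
import struct


def parse_ipv4_udp(raw):
    """Extract a few header fields from raw bytes that start at the IPv4 header."""
    (ver_ihl, _tos, total_len, _ident, _frag, ttl, proto, _csum,
     src, dst) = struct.unpack("!BBHHHBBH4s4s", raw[:20])
    header_len = (ver_ihl & 0x0F) * 4  # IHL is counted in 32-bit words
    features = {
        "src": socket.inet_ntoa(src),
        "dst": socket.inet_ntoa(dst),
        "proto": proto,
        "ttl": ttl,
        "length": total_len,
    }
    if proto == 17:  # 17 is the IP protocol number for UDP
        sport, dport, _ulen, _ucsum = struct.unpack(
            "!HHHH", raw[header_len:header_len + 8])
        features["sport"] = sport
        features["dport"] = dport
    return features


def handle_packet(raw, mode, training_store, predictor=None):
    """Store features while training; otherwise return the predictor's verdict."""
    features = parse_ipv4_udp(raw)
    if mode == "train":
        training_store.append(features)
        return None
    return predictor(features)
```

In training mode the parsed features accumulate in `training_store`; in prediction mode they are handed to whatever model has been fitted, represented here by the `predictor` callable.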

For both types of predictions, we used scikit-learn's (sklearn) implementations of various novelty detection algorithms (an overview of these can be found here).
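As a rough sketch of how novelty detection and majority voting can be combined with scikit-learn, the example below fits three detectors on normal samples only and flags a sample when most of them call it abnormal. The choice of these three algorithms, their parameters, and the synthetic two-dimensional features are assumptions made for illustration; the source does not state which detectors or features were used.

```python
import numpy as np
from sklearn.ensemble import IsolationForest
from sklearn.neighbors import LocalOutlierFactor
from sklearn.svm import OneClassSVM


def fit_detectors(normal_features):
    """Fit several novelty detectors on features taken from normal traffic only."""
    detectors = [
        OneClassSVM(nu=0.05, gamma="scale"),
        IsolationForest(n_estimators=100, random_state=0),
        LocalOutlierFactor(n_neighbors=20, novelty=True),
    ]
    for detector in detectors:
        detector.fit(normal_features)
    return detectors


def majority_vote(detectors, features):
    """Return a boolean array: True where most detectors flag the sample."""
    # predict() returns +1 for inliers and -1 for outliers/novelties.
    votes = np.stack([d.predict(features) == -1 for d in detectors])
    return votes.sum(axis=0) > len(detectors) / 2


# Synthetic stand-in for standardized features of normal traffic.
rng = np.random.default_rng(0)
normal = rng.normal(size=(500, 2))
detectors = fit_detectors(normal)

# One sample at the centre of the normal data, one far away from it.
samples = np.array([[0.0, 0.0], [8.0, 8.0]])
flags = majority_vote(detectors, samples)
```

Requiring agreement between detectors means that a single algorithm's false alarm does not raise an alert on its own, which is the motivation for voting given in the overview above.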

Architecture of ML based IDS

Attack Vectors:

To test our ML models, we generated attack vectors that replicate some well-known network-level attacks:

  • Port Scanning
  • Denial of Service (DoS)
  • Address Resolution Protocol (ARP) Poisoning
  • Replay Attacks

To generate our testbed, we systematically varied parameters in our attack generation scripts, so that different variations of the above-mentioned attacks are generated.
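Systematically varying parameters can be sketched as expanding a parameter grid into one configuration per combination. The parameter names and values below are hypothetical and chosen only for illustration; the source does not list the parameters that were actually varied, and this grid is not meant to reproduce the real test set.

```python
from itertools import product

# Hypothetical parameter values, for illustration only.
PARAMETER_GRID = {
    "attack": ["port_scan", "dos", "arp_poisoning", "replay"],
    "packets_per_second": [1, 100],
    "duration_seconds": [30, 120],
}


def expand_grid(grid):
    """Return one configuration dict per combination of parameter values."""
    names = list(grid)
    return [dict(zip(names, values)) for values in product(*grid.values())]


configs = expand_grid(PARAMETER_GRID)
```

Each resulting dictionary would then be passed to the corresponding attack generation script as its settings.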

We ran the Spire system and launched these attacks to test the performance of our ML models. The table below summarizes the number of attacks detected by each predictor and by the overall system, out of a total of 28 attacks.

         || Packet Analysis Model || Traffic Pattern Model || Overall System ||
Accuracy || 25/28 (89.3%)         || 22/28 (78.6%)         || 27/28 (96.4%)  ||
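The percentages follow directly from the detection counts. A short check, using only the counts reported in the table and rounding to one decimal place:

```python
TOTAL_ATTACKS = 28

# Number of attacks detected, as reported in the table.
detected = {
    "Packet Analysis Model": 25,
    "Traffic Pattern Model": 22,
    "Overall System": 27,
}

# Detection rate in percent, rounded to one decimal place.
rates = {name: round(100 * n / TOTAL_ATTACKS, 1) for name, n in detected.items()}
```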

Demos:

In the videos below, the top-right screen is the output log of the Traffic Pattern based Predictor, which classifies the past minute's traffic as either normal or abnormal. The top-left screen is the output log of the Packet Analysis based Predictor, which prints as soon as a packet is predicted to be an attack packet. The bottom window is used to generate attacks.

Demo 1: Probing/Scanning

In this type of attack the packets are low in volume, but their headers vary because the attack is intended to explore the system. In the video we generate random UDP packets that mimic genuine IP addresses. The Packet Analysis based Predictor (top-left screen) immediately detects the packets and prints their summary. The Traffic Pattern based Predictor waits until the end of the current interval (60 sec) to give summary statistics of the abnormal traffic it has observed.
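The interval-based behaviour of the Traffic Pattern based Predictor can be sketched as bucketing packet timestamps into 60-second intervals and judging each completed interval by its packet count. The fixed `low`/`high` thresholds below are a hypothetical stand-in for the trained model, and the function names are illustrative.

```python
from collections import Counter

INTERVAL_SECONDS = 60


def counts_per_interval(timestamps, interval=INTERVAL_SECONDS):
    """Count packets in each fixed-length interval, keyed by interval index.

    Intervals that contain no packets do not appear in the result.
    """
    return Counter(int(ts // interval) for ts in timestamps)


def flag_intervals(counts, low, high):
    """Return indices of intervals whose packet count falls outside [low, high]."""
    return sorted(idx for idx, n in counts.items() if n < low or n > high)


# Packet arrival times in seconds since the start of the capture.
timestamps = [0.5, 10, 59.9, 60, 61, 62, 63, 64, 130]
counts = counts_per_interval(timestamps)
abnormal = flag_intervals(counts, low=2, high=4)
```

Because a verdict is only available once an interval has ended, this kind of predictor reports up to a minute later than a per-packet predictor.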

Demo 2: DoS

In this Denial of Service attack a PLC is targeted. The headers of these packets may share some fields with good packets, and the packets are generally high in volume (as the goal is to exhaust resources). The Packet Analysis based Predictor detects these packets (highlighted in the video), as does the Traffic Pattern based Predictor.

Demo 3: Replay Attack with certain volume

As the name indicates, good packets are captured and then replayed, here in enough volume to act as a DoS attack. Because the headers match those of good packets exactly, the Packet Analysis based Predictor fails to detect them. Because of their significant volume, the Traffic Pattern based Predictor is able to detect them and reports that the volume is higher than expected.

Demo 4: Replay Attack with very low volume

The packets are captured good traffic and are also very low in volume. Hence, both predictors fail to detect this scenario. However, the system tolerates such attacks at both the network level and the system level, so they cause no adverse impact.

Demo 5: Byzantine Node Attack

In this scenario a node is assumed to be compromised. It is capable of sending DoS traffic to other components that are part of the system but not part of its intended communication. The Packet Analysis based Predictor detects the cases with mixed headers, and the Traffic Pattern based Predictor detects such scenarios even when the traffic volume is low.