
Dataset Replication Possibilities

Evaluation Framework For IDS Datasets

Notes for "An Evaluation Framework for Intrusion Detection Dataset" (Amirhossein Gharib, Iman Sharafaldin, et al.)

Options to get data for evaluation of defence techniques:

  • Replaying publicly available attack datasets.
  • Generating traffic: not great for simulating attacks. Curl-Loader is an open-source tool for generating artificial traffic.

Scott et al. presented three major criteria for a dataset:

  • redundancy
  • inherent unpredictability
  • complexity or multivariate dependencies.

Shiravi et al. defined evaluation criteria with six aspects:

  • realistic network
  • realistic traffic
  • labeled dataset
  • total interaction capture
  • complete capture
  • diversity of attacks.

The eleven features defined by the framework

  1. Complete Network Configuration: Several attacks only reveal themselves in a complete configuration of computers, servers, routers, and firewalls, so a realistic configuration is necessary to capture the real effects of attacks.

  2. Complete Traffic: A sequence of packets from a source (a host, router, or switch) to a destination, which may be another host, a multicast group, or a broadcast domain.

  3. Labeled Dataset: Tagging and labeling the data.

  4. Complete Interactions: Having all information about network interactions, such as those between internal LANs.

  5. Complete Capture: Traffic that is non-functional or not labeled shouldn't be removed, since it is important when calculating the false-positive percentage of an IDS.

  6. Available Protocols: Interactive traffic includes sessions that consist of short request and response pairs, such as applications involving real-time interactions with users. The dataset should contain both latency-sensitive and non-real-time data.

  7. Attack Diversity: Almost self-explanatory - must contain a variety of attacks.

  8. Anonymity: Privacy issues occur when both the IP addresses and the payload are available. However, removing the payload decreases the usefulness of the dataset for systems that rely on deep packet inspection (DPI).

  9. Heterogeneity: Different sources of information, such as operating system logs, network equipment logs, and network traffic.

  10. Feature Set: Extract different features from different data sources such as logs and traffic using feature extraction applications.

  11. Metadata: Include proper documentation about configuration, systems, attack scenarios, and other vital information.
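The Complete Capture criterion (item 5) matters because the false-positive rate is computed over all benign traffic. The sketch below uses made-up counts, not numbers from the paper, to show how dropping "non-functional" traffic can change the reported rate.

```python
def false_positive_rate(false_positives: int, true_negatives: int) -> float:
    """FPR = FP / (FP + TN), computed over benign flows only."""
    benign_total = false_positives + true_negatives
    if benign_total == 0:
        raise ValueError("no benign flows to evaluate")
    return false_positives / benign_total


# Hypothetical counts: the IDS raises 50 false alarms on 1000 benign flows.
full_capture = false_positive_rate(false_positives=50, true_negatives=950)

# Assume 200 benign flows were pruned as "non-functional", and 40 of the
# false alarms happened to be on exactly those flows.
pruned_capture = false_positive_rate(false_positives=10, true_negatives=790)

print(f"full capture FPR:   {full_capture:.4f}")
print(f"pruned capture FPR: {pruned_capture:.4f}")
```

With these assumed counts the pruned capture makes the IDS look four times better than it is on the full traffic.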

Generating Reliable Dataset

Benign Profile (B-Profile)

This method is used to generate benign background traffic. B-Profile is designed to extract the abstract behaviour of a group of human users, using machine learning models and statistical analysis techniques to capture the abstract features.

The encapsulated features are distributions of:

  • packet sizes of a protocol
  • number of packets per flow
  • certain patterns in the payload
  • size of payload
  • request time distribution of protocols.
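As a rough illustration of two of these distributions, the sketch below builds empirical distributions of packet sizes per protocol and of packets per flow. The packet records and function names are hypothetical. The paper's own profiling relies on machine learning and statistical analysis that is not reproduced here.

```python
from collections import Counter, defaultdict

# Hypothetical packet records: (flow_id, protocol, packet_size_bytes)
packets = [
    ("f1", "HTTP", 60), ("f1", "HTTP", 1500), ("f1", "HTTP", 1500),
    ("f2", "SSH", 60), ("f2", "SSH", 120),
    ("f3", "HTTP", 60),
]


def normalise(counter):
    """Turn raw counts into relative frequencies that sum to 1."""
    total = sum(counter.values())
    return {value: count / total for value, count in counter.items()}


def packet_size_distribution(packets):
    """Empirical packet-size distribution for each protocol."""
    sizes = defaultdict(Counter)
    for _flow, proto, size in packets:
        sizes[proto][size] += 1
    return {proto: normalise(counts) for proto, counts in sizes.items()}


def packets_per_flow_distribution(packets):
    """Empirical distribution of the number of packets per flow."""
    per_flow = Counter(flow for flow, _proto, _size in packets)
    return normalise(Counter(per_flow.values()))


print(packet_size_distribution(packets))
print(packets_per_flow_distribution(packets))
```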

There are two steps for creating benign profiles:

  • Individual Profiling: A rich dataset should contain events from HTTP, HTTPS, FTP, SSH, and email protocols. These can be captured using Man In The Middle (MITM), network sniffing, browser and email histories.

  • Clustering: Individual user profiles are compared against those of other users to create clusters of users with similar behaviour and distributions. The authors found the best results with the X-Means algorithm, using Dynamic Time Warping (DTW) as the distance measure for the similarity between two time-dependent sequences.
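To make the distance measure concrete, here is a minimal textbook DTW implementation in plain Python. It only shows the distance itself. The X-Means clustering step is not shown, and the example sequences are made up.

```python
def dtw_distance(a, b):
    """Classic O(len(a) * len(b)) dynamic time warping distance.

    Uses the absolute difference as the local cost between two samples.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = best alignment cost of a[:i] against b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # a advances, b repeats
                cost[i][j - 1],      # b advances, a repeats
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


# Hypothetical per-minute request counts for three users.
user_a = [0, 1, 2, 1, 0]
user_b = [0, 0, 1, 2, 1, 0]   # same shape as user_a, shifted in time
user_c = [2, 2, 2, 2, 2]      # constant activity

print(dtw_distance(user_a, user_b))
print(dtw_distance(user_a, user_c))
```

Because DTW allows samples to be stretched in time, user_a and user_b end up at distance zero even though their sequences differ in length, which is why it suits browsing behaviour that follows the same pattern at slightly different times.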

To generate traffic, a random B-Profile is selected and a slightly modified web-crawling mechanism is devised to demonstrate the browsing behaviour of users for HTTP and HTTPS.

Figure: benign profiling design.

Attack Profiles (M-Profile)

TODO: Will describe later

CIC-IDS2017

This is an overview of the creation process of the CIC-IDS2017 dataset. This comes from the paper "Towards Generating a New Intrusion Detection Dataset and Intrusion Traffic Characterization" by Iman Sharafaldin, Arash Habibi Lashkari, and Ali A. Ghorbani. I also use the details mentioned on the web page where the dataset is available: https://www.unb.ca/cic/datasets/ids-2017.html.

The dataset captures network data in the form of PCAPs and then analyzes the traffic with the CICFlowMeter tool to create flows labelled by timestamp, source and destination IPs, source and destination ports, protocols, and attacks.
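The idea of turning packets into flows can be sketched as follows. This is not CICFlowMeter's code. It is a simplified illustration that assumes flows are keyed on a direction-independent 5-tuple and cut once they exceed a timeout measured from the first packet. The Packet type, the timeout value, and the addresses are all assumptions.

```python
from collections import namedtuple

Packet = namedtuple("Packet", "ts src_ip src_port dst_ip dst_port proto")

FLOW_TIMEOUT = 120.0  # seconds; an assumed value for this sketch


def flow_key(pkt):
    """Direction-independent key so both directions share one flow."""
    a = (pkt.src_ip, pkt.src_port)
    b = (pkt.dst_ip, pkt.dst_port)
    return (min(a, b), max(a, b), pkt.proto)


def group_into_flows(packets, timeout=FLOW_TIMEOUT):
    """Group packets into flows, starting a new flow after the timeout."""
    flows = []    # each flow is a list of packets
    active = {}   # flow key -> index of the currently open flow
    for pkt in sorted(packets, key=lambda p: p.ts):
        key = flow_key(pkt)
        idx = active.get(key)
        if idx is None or pkt.ts - flows[idx][0].ts > timeout:
            flows.append([pkt])
            active[key] = len(flows) - 1
        else:
            flows[idx].append(pkt)
    return flows


packets = [
    Packet(0.0, "10.0.0.1", 1234, "10.0.0.2", 80, "TCP"),
    Packet(0.1, "10.0.0.2", 80, "10.0.0.1", 1234, "TCP"),    # reply
    Packet(1.0, "10.0.0.3", 5555, "10.0.0.2", 80, "TCP"),    # other client
    Packet(130.0, "10.0.0.1", 1234, "10.0.0.2", 80, "TCP"),  # after timeout
]
flows = group_into_flows(packets)
print([len(flow) for flow in flows])
```

The last packet belongs to the same connection as the first two, but it lands in a separate flow because of the timeout.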

This dataset claims to create realistic background traffic using a proposed B-Profile system to profile the abstract behaviour of human interactions and generate naturalistic benign background traffic.

Overview From Web Page

The dataset generation included 5 days of data collection:

  • Day 1: Only benign traffic.
  • Day 2: Brute force (FTP-Patator, SSH-Patator).
  • Day 3: DoS/DDoS, DoS slowloris, DoS Slowhttptest, DoS Hulk, DoS GoldenEye, Heartbleed.
  • Day 4: Web attack brute force, web attack XSS, web attack SQL injection, infiltration by Dropbox download, infiltration by Cool disk (Mac).
  • Day 5: Botnet ARES, Port Scan, DDoS LOIT.

There were 14 victim machines (including 1 web server, 1 Ubuntu server, 4 Ubuntu machines, 5 Windows machines, and 1 Mac machine) being attacked by 4 attacker machines (1 Kali Linux and 3 Windows machines).

Overview From Paper

The authors claim that good datasets do not exist for evaluating the performance of IDS techniques. Even the ones that are available are heavily anonymized and do not reflect current trends.

The paper contributes a new dataset that covers all eleven necessary criteria with common, up-to-date attacks. The rest of the paper analyzes existing datasets alongside the authors' own.

The authors extract 80 traffic features from the dataset using the CICFlowMeter tool.
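To give a feel for what such flow features look like, the sketch below computes a handful of statistics from the packet timestamps and lengths of a single flow. The feature names are illustrative and are not CICFlowMeter's actual column names. Using the population standard deviation is also an assumption.

```python
from statistics import mean, pstdev


def flow_features(timestamps, lengths):
    """Compute a few per-flow statistics from sorted packet timestamps
    (seconds) and the matching packet lengths (bytes)."""
    if not timestamps or len(timestamps) != len(lengths):
        raise ValueError("need one length per timestamp, and at least one packet")
    # Inter-arrival times between consecutive packets.
    iats = [later - earlier for earlier, later in zip(timestamps, timestamps[1:])]
    duration = timestamps[-1] - timestamps[0]
    total_bytes = sum(lengths)
    return {
        "flow_duration": duration,
        "total_packets": len(lengths),
        "total_bytes": total_bytes,
        "pkt_len_mean": mean(lengths),
        "pkt_len_std": pstdev(lengths),
        "iat_mean": mean(iats) if iats else 0.0,
        "bytes_per_second": total_bytes / duration if duration > 0 else 0.0,
    }


# Hypothetical flow with three packets.
features = flow_features(timestamps=[0.0, 0.5, 2.0], lengths=[60, 1500, 60])
for name, value in features.items():
    print(f"{name}: {value}")
```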

Experiment

The experiment uses two networks: an attack network and a victim network. The victim network consists of a firewall, router, switches, and the most common operating systems, along with an agent that provides the benign behaviour on each machine. The attack network is a completely separate network with its own router and switch, and machines with public IPs.

Benign Profile Agent (B-Profile)

This dataset uses the proposed B-Profile system from the paper "Towards a Reliable Intrusion Detection Benchmark Dataset" (Sharafaldin et al., 2017) which profiles human interactions and "generates naturalistic benign background traffic".

The profile is generated from 25 users based on HTTP, HTTPS, FTP, SSH, and email protocols.

Attack Vectors

The dataset uses the following attack vectors to attack the different machines:

  • Brute Force Attacks
  • Heartbleed Attack
  • Botnet
  • DoS Attacks
  • DDoS Attacks
  • Web Attacks
  • Infiltration Attacks

CIC-IDS2018 On AWS

  • Dataset based on the creation of user profiles that contain abstract representations of events and behaviours seen on the network.

  • Attack types: Brute-force, Heartbleed, Botnet, DoS, DDoS, Web attacks, and infiltration of the network from inside.

  • Attacking infrastructure includes 50 machines and the victim organization has 5 departments and includes 420 machines and 30 servers.

  • 80 features extracted from the captured traffic using CICFlowMeter-V3


Attack Approaches

  • Infiltration of network from inside: a malicious file leads to a backdoor being executed on a victim machine, which is then used to scan the internal network for other vulnerable machines.

  • HTTP DoS: Uses Slowloris, LOIC, and HOIC. Exploits open TCP connections sending valid but incomplete HTTP requests.

  • Web Attacks: Uses Damn Vulnerable Web App (DVWA) as the target and attacks vulnerabilities in the website: SQL injection, command injection, and unrestricted file upload.

  • Brute force attacks: weak username and password combinations - final goal of acquiring an SSH and MySQL account by running a dictionary brute force attack against the main server.

  • Last updated attacks: Attacks based on well-known vulnerabilities that can only be carried out during a specific window of time. Such vulnerabilities sometimes affect millions of computers, and patching them takes time. Heartbleed is one such attack (2018).

Benign Traffic

B-Profile is designed to extract the abstract behaviour of a group of human users. It tries to encapsulate the network events produced by users with machine learning and statistical analysis techniques.

Once B-Profiles are derived from users, an agent (CIC-BenignGenerator) or a human operator can use them to generate realistic benign events on a network.

Process

For each day, raw data was recorded per machine, including the network traffic (PCAPs) and event logs (Windows and Ubuntu event logs).

Problems With CIC-IDS2017 and CIC-IDS2018

The paper "Error prevalence in NIDS datasets: A case study on CIC-IDS-2017 and CSE-CIC-IDS-2018" by Lisa Liu et al. details multiple problems with the datasets.

Missed Attacks

Various malicious flows are missed in the original dataset. The dataset is "severely imbalanced" in favor of benign traffic. The missed flows are caused by various factors, ranging from imprecise attack time frames to incorrect attack assignments. The paper publishes a table of all the attacks missed by the original dataset:

Table: attack labels missed.
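A small sketch of how an imprecise time frame leads to a missed attack flow. The labelling rule, addresses, and times are all hypothetical and are not taken from either dataset. The rule assumes a flow is an attack only if it starts inside the attack window and goes from an attacker host to a victim host.

```python
from datetime import datetime


def label_flow(flow_start, src_ip, dst_ip, window, attacker_ips, victim_ips):
    """Label a flow ATTACK only if it starts inside the attack window
    and goes from an attacker host to a victim host."""
    window_start, window_end = window
    in_window = window_start <= flow_start <= window_end
    between_attack_hosts = src_ip in attacker_ips and dst_ip in victim_ips
    return "ATTACK" if in_window and between_attack_hosts else "BENIGN"


# Hypothetical hosts (documentation address ranges) and times.
ATTACKERS = {"203.0.113.5"}
VICTIMS = {"192.0.2.10"}
late_attack_flow = (datetime(2000, 1, 1, 10, 5), "203.0.113.5", "192.0.2.10")
bystander_flow = (datetime(2000, 1, 1, 10, 2), "192.0.2.20", "192.0.2.10")

# The recorded attack window ends too early; the real attack ran longer.
narrow = (datetime(2000, 1, 1, 10, 0), datetime(2000, 1, 1, 10, 4))
wide = (datetime(2000, 1, 1, 10, 0), datetime(2000, 1, 1, 10, 10))

missed = label_flow(*late_attack_flow, narrow, ATTACKERS, VICTIMS)
caught = label_flow(*late_attack_flow, wide, ATTACKERS, VICTIMS)
bystander = label_flow(*bystander_flow, wide, ATTACKERS, VICTIMS)
print(missed, caught, bystander)
```

The same attack flow is labelled benign under the narrow window and malicious under the wide one, while the host filter keeps unrelated traffic inside the window benign.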

Mislabelling

Here is an overview of flows that were mis-labelled by the original datasets:

  • empty payload (no traffic, just TCP start and finish) labelled as malicious.

  • port/system closed (malicious traffic sent to a system that is down or unavailable.)

  • attack startup/teardown artifacts (parts of attack traffic that aren't distinguishable from regular traffic). For example, some attacks require loading the front page before the start of the attack. Due to the absence of a malicious payload in this phase, they appear semantically identical to a benign user browsing the web-app, within the context of a single flow.

    • Could this be used as a way to detect this form of attacks?
    • Maybe the startup/teardown sections need to be marked as benign even though they may be correlated to a malicious attack.

  • no malicious payload: the attack payload exists, but the flow timeout set by CICFlowMeter splits the connection into two flows. The first flow contains all the malicious content and the later one contains none.

  • attack artifact: regular traffic between attacker and victim unrelated to the attack. This is marked malicious.

  • target system unresponsive: the target is unresponsive potentially due to being overwhelmed.

  • time-based labelling: without accurate host and port filtering, time-based labelling leads to traffic not involved in the attack being marked malicious.

  • ambiguous class labels: if you remove the flow id, source ip, source port, destination ip and timestamp then there are duplicate rows which are labelled differently.
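The ambiguous-label check can be sketched by dropping the identifying columns, grouping rows on the remaining feature values, and reporting any group that carries more than one label. The column names and rows below are hypothetical.

```python
from collections import defaultdict

# Identifying columns that are dropped before comparing rows.
ID_COLUMNS = {"flow_id", "src_ip", "src_port", "dst_ip", "timestamp"}


def find_ambiguous(rows, label_column="label"):
    """Return feature vectors that appear with more than one label."""
    labels_by_features = defaultdict(set)
    for row in rows:
        features = tuple(sorted(
            (name, value) for name, value in row.items()
            if name not in ID_COLUMNS and name != label_column
        ))
        labels_by_features[features].add(row[label_column])
    return {
        features: labels
        for features, labels in labels_by_features.items()
        if len(labels) > 1
    }


rows = [
    {"flow_id": "a", "src_ip": "10.0.0.1", "src_port": 1111, "dst_ip": "10.0.0.9",
     "timestamp": "t1", "dst_port": 80, "duration": 3, "packets": 2, "label": "BENIGN"},
    {"flow_id": "b", "src_ip": "10.0.0.2", "src_port": 2222, "dst_ip": "10.0.0.9",
     "timestamp": "t2", "dst_port": 80, "duration": 3, "packets": 2, "label": "DoS"},
    {"flow_id": "c", "src_ip": "10.0.0.3", "src_port": 3333, "dst_ip": "10.0.0.9",
     "timestamp": "t3", "dst_port": 22, "duration": 9, "packets": 40, "label": "BENIGN"},
]

ambiguous = find_ambiguous(rows)
for features, labels in ambiguous.items():
    print(dict(features), sorted(labels))
```

A model trained without the identifying columns cannot tell the first two rows apart, so no classifier can get both of them right.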

Modification To CICFlowMeter Tool

There are a number of issues with, and changes to, the CICFlowMeter tool that affect flow integrity and attack characterization.

  • packet time-stamping issue: Sometimes the SYN-ACK packets appear to arrive before the SYN packets from the attacker. This is probably because the operating system is responsible for time-stamping. Therefore, dataset creators should verify both directions of traffic when labelling flows.

  • TCP segmentation offloading: TCP segmentation offloading (TSO) leads to an IP length of 0 in the header. Since packet headers are used to determine how packets are dissected, these packets are put into a different flow. This affects attacks with large payloads.

  • attributes: some new attributes were added to better handle flows. The paper lists four new attributes added to the CICFlowMeter tool.
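One way to guard against the time-stamping issue, sketched under assumptions, is to decide who initiated a TCP connection from the flags rather than from whichever packet carries the earliest timestamp. This is an illustration only and is not the fix applied to CICFlowMeter in the paper. The packet tuples and addresses are hypothetical.

```python
SYN = 0x02  # TCP SYN flag bit
ACK = 0x10  # TCP ACK flag bit


def first_by_timestamp(packets):
    """Naive approach: the sender of the earliest packet is the initiator."""
    return min(packets, key=lambda pkt: pkt[0])[1]


def find_initiator(packets):
    """Prefer the sender of a pure SYN (SYN set, ACK clear) as initiator.

    packets: list of (timestamp, src_ip, dst_ip, tcp_flags) tuples.
    Falls back to the earliest timestamp if no pure SYN is present.
    """
    if not packets:
        raise ValueError("flow has no packets")
    for _ts, src, _dst, flags in packets:
        if (flags & SYN) and not (flags & ACK):
            return src
    return first_by_timestamp(packets)


# Hypothetical capture where the SYN-ACK was stamped slightly before the SYN.
attacker, victim = "203.0.113.5", "192.0.2.10"
flow = [
    (0.999, victim, attacker, SYN | ACK),
    (1.000, attacker, victim, SYN),
    (1.001, attacker, victim, ACK),
]

print(first_by_timestamp(flow))  # picks the victim, which is wrong here
print(find_initiator(flow))      # picks the attacker
```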

Impact on Training

The authors test their new cleaned dataset against existing models to see the differences between the datasets and to understand if the models were over-fitting to the existing data.

Automated Detection Of Labelling Errors

The authors use their manually corrected dataset as ground truth to evaluate automated detection of labelling errors. They use Confident Learning and O2U-Net, which have been used to detect labelling errors in the field of computer vision.
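The core idea of Confident Learning can be sketched in a few lines. This is a heavily simplified version written for these notes, not the authors' pipeline and not the API of any library. It assumes out-of-sample predicted probabilities are already available, and O2U-Net is not shown. For each class, the threshold is the average predicted probability of that class over the examples labelled with it. An example is flagged when a different class clears its threshold with the highest probability.

```python
def find_label_issues(labels, pred_probs):
    """Return indices of examples whose given label looks wrong.

    labels:     list of integer class labels (possibly noisy)
    pred_probs: list of per-class probability lists from some model
    """
    n_classes = len(pred_probs[0])

    # Per-class threshold: mean self-confidence of examples given that label.
    thresholds = []
    for k in range(n_classes):
        probs_k = [p[k] for y, p in zip(labels, pred_probs) if y == k]
        thresholds.append(sum(probs_k) / len(probs_k) if probs_k else float("inf"))

    issues = []
    for i, (given, probs) in enumerate(zip(labels, pred_probs)):
        confident = [k for k in range(n_classes) if probs[k] >= thresholds[k]]
        if confident:
            likely = max(confident, key=lambda k: probs[k])
            if likely != given:
                issues.append(i)
    return issues


# Hypothetical data: class 0 = benign, class 1 = attack.
labels = [0, 0, 0, 1, 1, 1]
pred_probs = [
    [0.9, 0.1], [0.8, 0.2], [0.1, 0.9],   # third row is labelled benign
    [0.2, 0.8], [0.1, 0.9], [0.3, 0.7],
]
issues = find_label_issues(labels, pred_probs)
print(issues)
```

Here only the third example is flagged: it is labelled benign, but the model is more confident that it is an attack than it is, on average, for flows actually labelled as attacks.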
