Hashdoop: A MapReduce framework for network anomaly detection
Hashdoop: A MapReduce framework for network anomaly detection.Anomaly detection is essential for preventing network outages and maintaining the network resources available. However, to cope with the increasing growth of Internet traffic, network anomaly detectors are only exposed to sampled traffic, so harmful traffic may avoid detector examination. In this paper, we investigate the benefits of recent distributed computing approaches for real-time analysis of non-sampled Internet traffic. Focusing on the MapReduce model, our study uncovers a fundamental difficulty in order to detect network traffic anomalies by using Hadoop.
Since MapReduce requires the dataset to be divided into small splits and anomaly detectors compute statistics from spatial and temporal traffic structures, special care should be taken when splitting traffic.We propose Hashdoop, a MapReduce framework that splits traffic with a hash function to preserve traffic structures and, hence, profits of distributed computing infrastructures to detect network anomalies. The benefits of Hashdoop are evaluated with two anomaly detectors and fifteen traces of Internet backbone traffic captured between 2001 and 2013. Using a 6-node cluster Hashdoop increased the throughput of the slowest detector with a speed-up of 15; thus, enabling real-time detection for the largest analyzed traces. Hashdoop also improved the overall detectors accuracy as splits emphasized anomalies by reducing the surrounding traffic.
Similar IEEE Project Titles
- FedLoop: Looping on Federated MapReduce
- Impact of MapReduce Task Re-execution Policy on Job Completion Reliability and Job Completion Time
- MaPLE: A MapReduce Pipeline for Lattice-based Evaluation and its application to SNOMED CT
- LIBRA: Lightweight Data Skew Mitigation in MapReduce
- A Platform to Deploy Customized Scientific Virtual Infrastructures on the Cloud