HadoopWatch: A first step towards comprehensive traffic forecasting in cloud computing
This paper presents our effort towards comprehensive traffic forecasting for big data applications using external, lightweight file system monitoring. Our idea is motivated by two key observations: rich traffic demand information already exists in the log and metadata files of many big data applications, and such information can be readily extracted through run-time file system monitoring.
As a first step, we use Hadoop as a concrete example to explore our methodology and develop a system called HadoopWatch to predict the traffic demand of Hadoop applications. We further implement HadoopWatch on a real, small-scale testbed with 10 physical servers and 30 virtual machines. Our experiments over a series of MapReduce applications demonstrate that HadoopWatch can forecast traffic demand ahead of time with almost 100% accuracy. Furthermore, it requires no modification to the Hadoop framework and introduces little overhead to application performance.
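To make the idea of external file system monitoring concrete, the following is a minimal sketch, not HadoopWatch's actual mechanism, which the abstract does not describe. It assumes that data a job will later send over the network first appears as files in some local directory, so that comparing two directory snapshots gives a rough estimate of upcoming traffic. The function names and the polling approach are illustrative assumptions.

```python
import os


def snapshot(directory):
    """Map each regular file under `directory` to its size in bytes."""
    sizes = {}
    for root, _dirs, files in os.walk(directory):
        for name in files:
            path = os.path.join(root, name)
            try:
                sizes[path] = os.path.getsize(path)
            except OSError:
                # The file vanished between listing and stat; skip it.
                continue
    return sizes


def changed_files(previous, current):
    """Return files that are new, or whose size differs, since `previous`."""
    return {p: s for p, s in current.items() if previous.get(p) != s}


def pending_bytes(changes):
    """Hypothetical forecast: total bytes in the changed files."""
    return sum(changes.values())
```

A monitor built this way would call `snapshot` periodically on a directory of interest and pass consecutive results to `changed_files`. Because it only reads the file system from outside the application, it needs no change to the framework being observed, which is the property the abstract emphasizes.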
Similar IEEE Project Titles
- FedLoop: Looping on Federated MapReduce
- Impact of MapReduce Task Re-execution Policy on Job Completion Reliability and Job Completion Time
- MaPLE: A MapReduce Pipeline for Lattice-based Evaluation and its application to SNOMED CT
- LIBRA: Lightweight Data Skew Mitigation in MapReduce
- A Platform to Deploy Customized Scientific Virtual Infrastructures on the Cloud