A study of big data processing constraints on a low-power Hadoop cluster
Big Data processing with Hadoop has been emerging recently, both in the computing cloud and in enterprise deployments. However, widespread security exploits may hurt the reputation of public clouds. If Hadoop on the cloud is not an option, an organization has to build its own Hadoop cluster. But owning a data center is not worthwhile for a small organization, in terms of both building and operating costs. A viable alternative is to build a cluster from low-cost ARM system-on-chip boards.
This paper presents a study of a Hadoop cluster for processing Big Data built atop 22 ARM boards. Hadoop’s MapReduce was replaced by Spark, and experiments on three different hardware configurations were conducted to understand the limitations and constraints of the cluster. From the experimental results, it can be concluded that processing Big Data on an ARM cluster is highly feasible. The cluster could process a 34 GB Wikipedia article file in acceptable time, while generally consuming 0.061-0.322 kWh of energy across all benchmarks. It was found that the I/O of the hardware is fast enough, but the CPU power is inadequate because it is largely spent on Hadoop’s I/O.
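To put the reported energy range in more familiar units, the following minimal sketch converts kilowatt-hours to joules and derives an average power draw for a given runtime. The 0.061 and 0.322 kWh figures come from the abstract; the helper names and the one-hour runtime in the usage line are hypothetical, since the abstract does not report per-benchmark runtimes.

```python
# Minimal sketch: unit conversions for the energy range quoted in the abstract.
# Function names and the example runtime are illustrative assumptions,
# not values or code from the paper.

KWH_TO_JOULES = 3.6e6  # 1 kWh = 3.6 MJ by definition


def kwh_to_joules(kwh: float) -> float:
    """Convert an energy value from kilowatt-hours to joules."""
    return kwh * KWH_TO_JOULES


def average_power_watts(kwh: float, runtime_seconds: float) -> float:
    """Average power in watts for a job that used `kwh` over `runtime_seconds`."""
    if runtime_seconds <= 0:
        raise ValueError("runtime_seconds must be positive")
    return kwh_to_joules(kwh) / runtime_seconds


if __name__ == "__main__":
    for kwh in (0.061, 0.322):  # range reported in the abstract
        print(f"{kwh} kWh = {kwh_to_joules(kwh) / 1e6:.3f} MJ")
    # Hypothetical runtime of one hour, chosen only to illustrate the formula.
    print(f"{average_power_watts(0.322, 3600.0):.0f} W average over 1 h")
```

With measured runtimes from an actual benchmark run, the same helper would give the cluster's average draw for that benchmark.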
Similar IEEE Project Titles
- An experimental approach towards big data for analyzing memory utilization on a hadoop cluster using HDFS and MapReduce
- Hadoop: Addressing challenges of Big Data
- DataMPI: Extending MPI to Hadoop-Like Big Data Computing
- Research on big data information retrieval based on hadoop architecture
- Effectiveness Assessment of Solid-State Drive Used in Big Data Services