A virtual machine based task scheduling approach to improving data locality for virtualized Hadoop.

MapReduce has emerged as an important distributed programming paradigm for large-scale data analysis applications. As an open-source implementation of MapReduce, Hadoop is an attractive platform for many enterprises. A traditional Hadoop cluster deployed on a large number of physical machines has drawbacks, such as burdensome cluster management and fluctuating resource utilization. A virtualized Hadoop cluster not only simplifies cluster management but also facilitates cost-effective workload consolidation to improve resource utilization. In a Hadoop system, data locality is a critical factor affecting the performance of MapReduce applications. However, existing task scheduling approaches to improving data locality in virtualized Hadoop are not effective because data is distributed at two levels: virtual machines and physical servers.

In this paper, we deploy a virtualized Hadoop cluster in which computing nodes and storage nodes are placed in separate virtual machines to improve flexibility. We propose a novel task scheduling approach that aims to improve data locality for a virtualized Hadoop cluster by migrating the virtual machine acting as a computing node to the physical server running the virtual machine that acts as a storage node and holds a data replica needed by that computing node. We evaluated our approach’s efficiency on a virtualized Hadoop cluster with this deployment, using 11 computing nodes and 12 storage nodes. Our experimental results show that our approach improves the performance of 86% of the typical MapReduce applications in our benchmark suite to varying degrees.
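
The migration decision described above can be illustrated with a minimal sketch. It is not the paper's algorithm: the `StorageVM` type, the `pick_migration_target` function, and the host and block names are all hypothetical, and the rule of taking the first host that holds a replica is an assumption made only to keep the example short. The abstract does not say how the approach weighs migration cost or server load.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StorageVM:
    """Hypothetical record for a VM acting as a storage node."""
    name: str
    host: str          # physical server running this storage VM
    blocks: frozenset  # IDs of the data blocks this VM holds replicas of


def pick_migration_target(compute_host, needed_block, storage_vms):
    """Pick the physical server a computing VM should run on.

    Returns compute_host when a replica is already on the same server
    (no migration needed), the host of the first storage VM holding a
    replica otherwise, or None when no replica is known.
    """
    holders = [vm.host for vm in storage_vms if needed_block in vm.blocks]
    if not holders:
        return None
    if compute_host in holders:
        return compute_host
    # Assumption: take the first candidate; a real scheduler would
    # likely compare migration cost and load across candidates.
    return holders[0]


# Illustrative placement: two storage VMs on two physical servers.
storage_vms = [
    StorageVM("s1", "hostA", frozenset({"b1", "b2"})),
    StorageVM("s2", "hostB", frozenset({"b2", "b3"})),
]

# A computing VM on hostC needs block b3, held only on hostB.
print(pick_migration_target("hostC", "b3", storage_vms))
```

Because locality here is judged at the physical-server level, a computing VM already on `hostA` that needs block `b2` stays where it is, even though the replica lives in a different virtual machine on that server.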
