DynamicMR: A Dynamic Slot Allocation Optimization Framework for MapReduce Clusters
MapReduce is a popular computing paradigm for large-scale data processing in cloud computing. However, a slot-based MapReduce system (e.g., Hadoop MRv1) can suffer from poor performance due to its unoptimized resource allocation. To address this, the paper identifies and optimizes resource allocation from three key aspects. First, because map slots and reduce slots are pre-configured as distinct and are not fungible, slots can be severely under-utilized: map slots might be fully utilized while reduce slots sit empty, and vice versa. We propose an alternative technique called Dynamic Hadoop Slot Allocation that keeps the slot-based model. It relaxes the slot allocation constraint so that slots can be reallocated to either map or reduce tasks depending on their needs. Second, speculative execution can tackle the straggler problem, and it has been shown to improve the performance of a single job, but at the expense of cluster efficiency.
In view of this, we propose Speculative Execution Performance Balancing to balance the performance tradeoff between a single job and a batch of jobs. Third, delay scheduling has been shown to improve data locality, but at the cost of fairness. As an alternative, we propose a technique called Slot PreScheduling that can improve data locality with no impact on fairness. Finally, by combining these techniques, we form a step-by-step slot allocation system called DynamicMR that can improve the performance of MapReduce workloads substantially. The experimental results show that DynamicMR improves the performance of Hadoop MRv1 significantly while maintaining fairness, by up to 46 to 115 percent for single jobs and 49 to 112 percent for multiple jobs. Moreover, an experimental comparison with YARN shows that DynamicMR outperforms YARN by about 2 to 9 percent for multiple jobs, due to its ratio control mechanism for running map/reduce tasks.
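The slot-borrowing idea behind Dynamic Hadoop Slot Allocation can be illustrated with a minimal sketch. This is not the paper's algorithm or Hadoop code: the function names `static_allocation` and `dynamic_allocation` and the single-step, cluster-wide view are assumptions made purely for illustration. It only shows how idle slots of one type might be lent to pending tasks of the other type.

```python
def static_allocation(map_slots, reduce_slots, pending_maps, pending_reduces):
    """Fixed model: map tasks use only map slots, reduce tasks only reduce slots."""
    return min(pending_maps, map_slots), min(pending_reduces, reduce_slots)


def dynamic_allocation(map_slots, reduce_slots, pending_maps, pending_reduces):
    """Hypothetical relaxed model: idle slots may be lent to the other task type.

    Returns (number of running map tasks, number of running reduce tasks).
    """
    # Each task type first fills its own pre-configured slots.
    maps_run, reduces_run = static_allocation(
        map_slots, reduce_slots, pending_maps, pending_reduces
    )

    # Slots left idle after that first pass.
    idle_map = map_slots - maps_run
    idle_reduce = reduce_slots - reduces_run

    # Remaining demand of one type borrows the idle slots of the other type.
    extra_maps = min(pending_maps - maps_run, idle_reduce)
    extra_reduces = min(pending_reduces - reduces_run, idle_map)

    return maps_run + extra_maps, reduces_run + extra_reduces


# Map-heavy phase: 4 map slots, 4 reduce slots, 10 pending maps, no reduces yet.
static_result = static_allocation(4, 4, 10, 0)
dynamic_result = dynamic_allocation(4, 4, 10, 0)
print(static_result, dynamic_result)
```

In the map-heavy case the static model leaves all four reduce slots idle, while the relaxed model lets map tasks occupy them. The sketch ignores fairness between jobs, data locality, and speculative tasks, which the other two techniques in the abstract address.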
Similar IEEE Project Titles
- Evaluating MapReduce frameworks for iterative Scientific Computing applications
- Efficient way of searching data in MapReduce paradigm
- Enumerating Maximal Bicliques from a Large Graph Using MapReduce
- Scalable community detection from networks by computing edge betweenness on MapReduce
- Hybrid cloud infrastructure to handle large scale data for Bangladesh people search (BDPS)