An enhanced agglomerative fuzzy k-means clustering method with MapReduce implementation on Hadoop platform
In this paper, an enhanced agglomerative fuzzy k-means clustering algorithm with a MapReduce implementation is proposed. The algorithm introduces an initial center selection method that improves the accuracy and speeds up the convergence of the agglomerative fuzzy k-means algorithm.
Then, a MapReduce implementation based on Apache Hadoop is presented to improve scalability on large-scale datasets. Experiments were conducted on a synthetic dataset, the WINE dataset from the UCI Repository, and a randomly generated large dataset. The experimental results show that the proposed algorithm can identify the true number of clusters and produce accurate results, with good scalability on large datasets.
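The abstract does not give the update equations or the Hadoop job layout, so the following is only a minimal sketch of how one iteration of standard fuzzy k-means can be split into a map step and a reduce step. It is not the paper's enhanced agglomerative method, its initial center selection, or its Hadoop code. The function names (memberships, map_point, reduce_cluster, fuzzy_kmeans_iteration) and the fuzzifier value m = 2 are assumptions made for illustration, and the shuffle phase is simulated in memory with a dictionary.

```python
import numpy as np

def memberships(x, centers, m=2.0):
    """Standard fuzzy k-means membership of one point in every cluster."""
    d2 = np.sum((centers - x) ** 2, axis=1)  # squared distance to each center
    if np.any(d2 == 0):
        # The point coincides with a center: give it full membership there.
        u = (d2 == 0).astype(float)
        return u / u.sum()
    inv = d2 ** (-1.0 / (m - 1.0))
    return inv / inv.sum()

def map_point(x, centers, m=2.0):
    """Map step: emit (cluster index, (weighted point, weight)) per cluster."""
    w = memberships(x, centers, m) ** m
    return [(j, (w[j] * x, w[j])) for j in range(len(centers))]

def reduce_cluster(values):
    """Reduce step: combine the partial sums for one cluster into a center."""
    weighted_sum = sum(v[0] for v in values)
    total_weight = sum(v[1] for v in values)
    return weighted_sum / total_weight

def fuzzy_kmeans_iteration(points, centers, m=2.0):
    """One full iteration; the dict stands in for the MapReduce shuffle."""
    grouped = {j: [] for j in range(len(centers))}
    for x in points:
        for j, value in map_point(x, centers, m):
            grouped[j].append(value)
    return np.array([reduce_cluster(grouped[j]) for j in range(len(centers))])

# Hypothetical toy data: two small, well-separated groups of points.
points = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
                   [10.0, 10.0], [10.0, 11.0], [11.0, 10.0]])
centers = np.array([[0.0, 0.0], [10.0, 10.0]])
for _ in range(10):
    centers = fuzzy_kmeans_iteration(points, centers)
```

Because each mapper only needs the current centers and its own points, and each reducer only sums the values for one cluster, this split is the usual reason k-means-style algorithms fit the MapReduce model; on a real cluster the centers would be broadcast to the mappers before every iteration.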
Similar IEEE Project Titles
- Performance Modeling for RDMA-Enhanced Hadoop MapReduce
- Leveraging hadoop framework to develop duplication detector and analysis using Mapreduce, Hive and Pig
- Automatic Detection and Rectification of DNS Reflection Amplification Attacks with Hadoop MapReduce and Chukwa
- Mammoth: Gearing Hadoop Towards Memory-Intensive MapReduce Applications
- Towards a cost-efficient MapReduce: Mitigating power peaks for Hadoop clusters