CCF: Fast and scalable connected component computation in MapReduce
Finding connected components in a graph is a well-known problem in a wide variety of application areas, such as social network analysis, data mining, and image processing. In this paper, we present an efficient and scalable approach in MapReduce to find all the connected components in a given graph.
We compare our approach with the state of the art on a real-world graph. We also demonstrate the viability of our approach on a massive graph with ~6B nodes and ~92B edges on an 80-node Hadoop cluster. To the best of our knowledge, this is the largest graph publicly used in such an experiment.