Supporting Queries and Analyses of Large-Scale Social Media Data with Customizable and Scalable Indexing Techniques over NoSQL Databases

Social media data analysis demonstrates two special characteristics in Big Data processing. First, most analyses focus on data subsets related to specific social events or activities instead of the whole dataset. Second, analysis workflows consist of multiple stages, and algorithms applied in each stage may use different computation and communication patterns depending on processing frameworks. This paper presents our efforts in supporting the data storage and processing requirements for such characteristics. To achieve efficient queries about target data subsets, we propose a general customizable and scalable indexing framework that can be built over distributed NoSQL databases.


This framework allows users to define suitable customized index structures for their query patterns against social media data, and supports scalable indexing of both historical and streaming data. We implement this framework on HBase, and name it IndexedHBase. Starting from IndexedHBase, we build a distributed analysis stack based on YARN to support analysis algorithms using different processing frameworks, such as Hadoop MapReduce, Harp, and Giraph. This analysis stack is used to host the Truthy social media data observatory, and we have applied the customized index structures in supporting both query evaluation and sophisticated analysis algorithms. Performance tests show that our solutions outperform implementations using both direct raw data scans and current indexing mechanisms in existing NoSQL databases.
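
The idea of a user-defined index structure can be illustrated with a small sketch. The code below is not the IndexedHBase API; it is an in-memory stand-in that assumes an index table laid out like a wide-column store, with the indexed term as the row key, the record ID as the column qualifier, and the creation time as the cell value. The names `IndexTable`, `hashtag_index`, and the record fields `id`, `created_at`, and `hashtags` are hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, List, Tuple

# A record is a plain dict; the field names used below are hypothetical.
Record = Dict[str, object]
# One index entry: (row key, column qualifier, cell value).
Entry = Tuple[str, str, int]


class IndexTable:
    """In-memory stand-in for a single index table in a wide-column store."""

    def __init__(self, builder: Callable[[Record], Iterable[Entry]]):
        # The builder is the user-supplied part: it decides which entries
        # a record contributes to this index.
        self.builder = builder
        self.rows: Dict[str, Dict[str, int]] = defaultdict(dict)

    def index(self, record: Record) -> None:
        """Insert every entry the builder derives from one record."""
        for row_key, qualifier, value in self.builder(record):
            self.rows[row_key][qualifier] = value

    def query(self, row_key: str, start: int, end: int) -> List[str]:
        """Return record IDs under row_key whose value lies in [start, end]."""
        cells = self.rows.get(row_key, {})
        return sorted(q for q, ts in cells.items() if start <= ts <= end)


def hashtag_index(record: Record) -> Iterable[Entry]:
    """Hypothetical builder: one entry per hashtag, keyed by lowercased tag."""
    for tag in record["hashtags"]:
        yield tag.lower(), record["id"], record["created_at"]


table = IndexTable(hashtag_index)
tweets = [
    {"id": "t1", "created_at": 100, "hashtags": ["NoSQL", "HBase"]},
    {"id": "t2", "created_at": 150, "hashtags": ["nosql"]},
    {"id": "t3", "created_at": 300, "hashtags": ["NoSQL"]},
]
for tweet in tweets:
    table.index(tweet)

# Only the rows for "nosql" are touched; no scan over all tweets is needed.
matches = table.query("nosql", 100, 200)
```

Swapping in a different builder function (for example, one keyed by user ID or by a term in the text) gives a different index over the same records, which is the kind of customization the abstract describes. Because records are indexed one at a time, the same `index` call could in principle serve both a bulk load of historical data and a streaming feed.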
