Category Archives: projects on hadoop


Projects on Hadoop


Projects on Hadoop deal with processing large amounts of data using clusters of commodity hardware.



  • An Abnormal Network Behavior Detection System Based on Compound Session
    In recent years, with the rapid development of the Internet on a global scale and the rapid spread of mobile apps, the Internet has become an integral part of people's lives. Meanwhile, network problems caused by abnormal network behavior have become more prominent than ever. We also keep a great deal of personal information on the Internet, and its leakage can cause significant losses. Finding an effective method to detect abnormal network behavior is therefore increasingly important. This paper first introduces a new detection method based on compound sessions and then demonstrates its effectiveness. A further objective of the method is to identify the infected host.


  • Federating Web-Based Applications on a Hierarchical Cloud
    Cloud-based infrastructures enable applications to collect and analyze massive amounts of data. Sometimes these applications are the product of green-field engineering, but frequently they evolve from traditional RDBMS-based implementations. In either case, NoSQL databases, which gain high availability, elasticity and scalability through easy deployment on cloud-computing platforms, have become an attractive data-storage solution for these big-data applications. Unfortunately, to date there is little methodological and tool support for migrating existing applications to these new platforms. In this paper, we describe a hybrid architecture for location-aware applications on a hierarchical cloud, a methodology for mapping relational (including spatio-temporal) data to HBase, and a process for migrating legacy applications to the new architecture.


  • LSD2H: A Novel Storage Method of Linked Sensor Data Based on HBase
    With the development of sensor networks, the number of sensors is increasing. To manage sensor data, the W3C has proposed the Semantic Sensor Web, which unifies sensor network data and produces massive RDF datasets. Linked Sensor Data is a typical dataset of the Semantic Sensor Web and links into Linked Open Data based on the SSN ontology. HBase is a distributed database well suited to managing Linked Sensor Data. We therefore propose LSD2H, a storage method for Linked Sensor Data that uses HBase. The LSD2H architecture consists of SDS, ODS, SD2H, OD2H and SCQuery. SDS is an HBase table for storing sensor data. ODS is an HBase table for storing observation data. SD2H maps Linked Sensor Data to SDS, while OD2H maps Linked Sensor Data to ODS by means of MapReduce. SCQuery, based on tree patterns and a recursive algorithm, is a query method over SDS and ODS. To reduce storage space and improve query performance, we compare LZO, GZIP and Snappy compression against no compression; the analysis shows that LZO is ultimately the beneficial choice for data storage in LSD2H.


  • Blinked Data: Concepts, Characteristics, and Challenge
    Big Data refers to large and complex data. It has four characteristics: volume, variety, velocity, and veracity. Big Data typically comes in three types: structured, semi-structured, and unstructured, of which the last two pose challenges in applications such as query processing. Query processing on semi-structured data is especially challenging. Big Data and its characteristics have been documented in a large volume of literature; however, a comprehensive discussion of the characteristics of a specific type of Big Data is missing. A solid understanding of these characteristics is a sine qua non for processing complex queries efficiently. Big Data is a generic term. We do not categorise Big Data in this paper; instead, we focus only on Big Linked Data, which we call Blinked Data. It is a variant of Big semi-structured data with a set of characteristics that are critical to modeling and processing. In this paper, we investigate the characteristics and challenges of Blinked Data. The paper aims to provide a comprehensive description of the concept "Blinked Data". In addition, it presents the challenges in processing queries on Blinked Data through an empirical study.


  • Cloud computing: Need, enabling technology, architecture, advantages and challenges
    Commercially popularized in 2002 with the launch of Amazon Web Services, cloud computing has changed the way IT services and resources are delivered to customers. Through service models such as SaaS, PaaS and IaaS, it has made resources available on demand and in a scalable manner that were never available that way before. With high scalability and flexibility, excellent reliability and availability, and no upfront cost for procuring and managing IT infrastructure, it has been widely adopted by organizations. This paper aims to cover cloud computing from an overall perspective. It covers the basics of cloud computing, the service and deployment models in use today, the components of cloud computing, the need for and working of cloud computing, the cloud computing reference model, enabling technologies, challenges, and advantages.



