Distributed Binary Subspace Learning on large-scale cross media data
Distributed Binary Subspace Learning on large-scale cross media data.Due to the ubiquitous existence of large–scale data in today’s real-world applications including learningon cross media data, we propose a semi-supervised learning method named Multiple Binary SubspaceRegression (MBSR) for cross media data classification. In order to mine the common features among the data with multiple modalities, we project the original cross–media data into the same low-rank representation simultaneously by mapping to the corresponding subspaces for dimension reduction.
All the subspaces are set to be binary, which only involve the addition operations and omit the multiplication operations in the subsequent computation owing to the good property of the binary values. The dimension reduction to a binary subspace and the classification on this subspace are also optimized simultaneously leading to a semi-supervised model. For dealing with large–scale data, our learningmethod is easily implemented to run in a MapReduce-based Hadoop system. Empirical studies demonstrate its competitive performance on convergence, efficiency, and scalability in comparison with the state-of-the-art literature.
Similar IEEE Project Titles
- A Processing Pipeline for Cassandra Datasets Based on Hadoop Streaming
- Scaling Hadoop clusters with virtualized volunteer computing environment
- HaSTE: Hadoop YARN Scheduling Based on Task-Dependency and Resource-Demand
- A round robin with multiple feedback job scheduler in Hadoop
- A Hadoop Based Weblog Analysis System