Switch-SSD cache based XML query processing in Hadoop
Hadoop, open source software that implements the MapReduce framework, is well suited to speeding up parallel XML query processing. We propose a distributed caching architecture for Hadoop clusters, called switch-SSD, which caches XML query results en route in the network switching nodes. Switch-SSD extends the limited memory space of OpenFlow switches with SSD storage so that XML query results can be cached in the switch.
We design an OpenFlow controller that acts as a cache manager coordinating the switch-SSDs. With the help of the controller, a switch-SSD intercepts the query request and proactively sends the cached results to the client, so the client does not have to perform a cache read operation itself. By caching results, switch-SSD reduces query computation and lowers job execution times in the Hadoop cluster. Experimental results show that switch-SSD can improve the efficiency of most existing parallel XML query processing approaches in Hadoop clusters.
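
The request path described above can be illustrated with a small in-process simulation. This is only a sketch under stated assumptions, not the system's implementation: the class names (SwitchSSD, CacheManager), the hash-based cache key, and the LRU policy for spilling from switch memory to SSD are all hypothetical choices made for illustration, and no real OpenFlow or Hadoop API is used.

```python
from collections import OrderedDict
import hashlib


def query_key(xml_path, query):
    """Hypothetical cache key: hash of the input path and the query text."""
    return hashlib.sha256(f"{xml_path}\n{query}".encode("utf-8")).hexdigest()


class SwitchSSD:
    """Toy model of a switch: a small memory tier backed by a larger SSD tier."""

    def __init__(self, name, mem_slots):
        self.name = name
        self.mem_slots = mem_slots
        self.memory = OrderedDict()  # small tier, kept in LRU order (assumption)
        self.ssd = {}                # larger tier, unbounded in this sketch

    def store(self, key, result):
        self.memory[key] = result
        self.memory.move_to_end(key)
        # Spill least recently used entries to the SSD tier when memory is full.
        while len(self.memory) > self.mem_slots:
            old_key, old_result = self.memory.popitem(last=False)
            self.ssd[old_key] = old_result

    def lookup(self, key):
        if key in self.memory:
            self.memory.move_to_end(key)
            return self.memory[key]
        return self.ssd.get(key)


class CacheManager:
    """Toy stand-in for the controller: records which switch holds each result."""

    def __init__(self):
        self.directory = {}

    def register(self, key, switch):
        self.directory[key] = switch

    def locate(self, key):
        return self.directory.get(key)


def handle_query(manager, ingress_switch, xml_path, query, run_job):
    """Serve a query from a switch cache if possible, else run the job and cache it."""
    key = query_key(xml_path, query)
    holder = manager.locate(key)
    if holder is not None:
        cached = holder.lookup(key)
        if cached is not None:
            return cached, "cache"   # the switch answers; no job is executed
    result = run_job(xml_path, query)  # stands in for the MapReduce XML query job
    ingress_switch.store(key, result)
    manager.register(key, ingress_switch)
    return result, "job"
```

In this sketch a repeated query is answered by the switch that holds the result, and the placeholder `run_job` callable is invoked only on a miss, which mirrors the stated goal of avoiding repeated query computation.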