On Using Pattern Matching Algorithms in MapReduce Applications



Babaii Rizvandi, Nikzad; Taheri, Javid; Zomaya, Albert


Conference Material

The 9th IEEE International Symposium on Parallel and Distributed Processing with Applications

Busan, Korea


In this paper, we study the CPU utilization time patterns of applications on a MapReduce cluster. After extracting the patterns of several applications and storing them in a reference database, we compare the CPU utilization time pattern of a new incoming application against the patterns in the database to find the most similar one. Because the patterns differ in length, Dynamic Time Warping (DTW) is used for the comparison. Correlation analysis is then applied to the DTW output to express the similarity between patterns as a percentage (0% to 100%). We expect that if the patterns of two applications are highly similar (above 90%) for several different sets of MapReduce configuration parameter values, then the applications are very likely to follow the same pattern for every set of configuration parameter values. This gives us a technique for grouping applications into classes. Moreover, if the optimal configuration parameter values are obtained for one application, the same values can be used to run other applications in the same class optimally. Our evaluation on three real applications (WordCount, Exim Mainlog parsing and TeraSort) on MapReduce in pseudo-distributed mode shows the effectiveness of our approach.
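The comparison step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a textbook DTW with absolute-difference cost, a Pearson correlation computed over the two DTW-aligned series, and negative correlations clamped to 0% so the score falls between 0% and 100%. The function names (`dtw_path`, `similarity_percent`, `likely_same_class`) and the sample CPU traces are hypothetical.

```python
import math


def dtw_path(a, b):
    """Return the DTW alignment path between sequences a and b
    as a list of (index_in_a, index_in_b) pairs."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal cumulative distance aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    # Backtrack from the last cell to recover the alignment.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        steps = [
            (cost[i - 1][j - 1], i - 1, j - 1),
            (cost[i - 1][j], i - 1, j),
            (cost[i][j - 1], i, j - 1),
        ]
        _, i, j = min(steps)
    path.reverse()
    return path


def similarity_percent(a, b):
    """Similarity in [0, 100]: Pearson correlation of the DTW-aligned series.
    Assumption: negative correlation is reported as 0%."""
    path = dtw_path(a, b)
    wa = [a[i] for i, _ in path]  # a stretched along the alignment
    wb = [b[j] for _, j in path]  # b stretched along the alignment
    mean_a = sum(wa) / len(wa)
    mean_b = sum(wb) / len(wb)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(wa, wb))
    var_a = sum((x - mean_a) ** 2 for x in wa)
    var_b = sum((y - mean_b) ** 2 for y in wb)
    if var_a == 0 or var_b == 0:
        # Correlation is undefined for a constant series.
        return 100.0 if wa == wb else 0.0
    r = cov / math.sqrt(var_a * var_b)
    return min(1.0, max(0.0, r)) * 100.0


def likely_same_class(scores, threshold=90.0):
    """True if every per-configuration similarity score exceeds the threshold."""
    return all(s > threshold for s in scores)


# Hypothetical CPU utilization traces (percent) of different lengths.
reference = [0, 10, 50, 90, 50, 10, 0]
incoming = [0, 0, 10, 10, 50, 50, 90, 90, 50, 50, 10, 10, 0, 0]
score = similarity_percent(reference, incoming)
```

Here `incoming` is `reference` stretched in time, so DTW aligns the two traces exactly and the score is close to 100%. Passing one score per configuration parameter set to `likely_same_class` mirrors the 90% criterion stated in the abstract.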

MapReduce, Pattern Matching, Configuration Parameters



Babaii Rizvandi, Nikzad; Taheri, Javid; Zomaya, Albert. On Using Pattern Matching Algorithms in MapReduce Applications. [Conference Material]. 2011-05-28. http://hdl.handle.net/102.100.100/104280?index=1
