On Modelling and Prediction of Total CPU Usage for Applications in MapReduce Environments



Babaii Rizvandi, Nikzad; Zomaya, Albert; Moraveji, Reza; Taheri, Javid


2012-08-06


Conference Material


International Conference on Algorithms and Architectures for Parallel Processing (ICA3PP)


Japan


414-427


In this paper, we present an approach to model and predict the total CPU utilization (in terms of CPU clock ticks) of MapReduce applications for input data of nearly fixed size. Our approach has two key phases: profiling and modelling. In the profiling phase, an application is run several times with different sets of MapReduce configuration parameters to profile the total CPU clock ticks of the application on fixed-size input data. In the modelling phase, polynomial regression is used to map the sets of MapReduce configuration parameters (number of mappers and number of reducers) to the total CPU clock ticks of the application. The derived model can then be used to (1) predict the total CPU requirements of an unseen experiment of the same application with similar input data sizes, and (2) automatically tune MapReduce configuration parameters so that other applications run effectively. Our approach aims to eliminate error-prone manual processes and presents a fully automated solution. Three standard applications (WordCount, Exim MainLog parsing and TeraSort) are used to evaluate our modelling technique on 20 virtual nodes in a private cloud. Results show that the prediction error on unseen experiments, comparing our model against real data, is up to 1.59%, 2.28% and 7.26% for WordCount, Exim MainLog parsing and TeraSort, respectively.
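The modelling phase described in the abstract can be illustrated with a minimal sketch. It assumes a degree-2 polynomial with a cross term, fitted by ordinary least squares; the abstract does not state the polynomial degree or fitting procedure the authors used. The function names and the `synthetic_ticks` generator are hypothetical, and the profiling values are made up purely for illustration. They are not measurements from the paper.

```python
import numpy as np


def design_matrix(mappers, reducers, degree=2):
    """Polynomial features m**i * r**j for all i + j <= degree (assumed form)."""
    m = np.asarray(mappers, dtype=float)
    r = np.asarray(reducers, dtype=float)
    cols = [m**i * r**j
            for i in range(degree + 1)
            for j in range(degree + 1 - i)]
    return np.column_stack(cols)


def fit_cpu_model(mappers, reducers, cpu_ticks, degree=2):
    """Least-squares fit of total CPU clock ticks against (mappers, reducers)."""
    X = design_matrix(mappers, reducers, degree)
    y = np.asarray(cpu_ticks, dtype=float)
    coeffs, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coeffs


def predict_cpu_ticks(coeffs, mappers, reducers, degree=2):
    """Evaluate the fitted polynomial at new configurations."""
    return design_matrix(mappers, reducers, degree) @ coeffs


def relative_error_percent(predicted, actual):
    """Prediction error as a percentage of the measured value."""
    return 100.0 * abs(predicted - actual) / abs(actual)


def synthetic_ticks(m, r):
    """Hypothetical stand-in for a profiled run; NOT data from the paper."""
    return 1e6 + 2e3 * m + 5e3 * r + 30 * m**2 + 10 * m * r + 50 * r**2


# Profiling phase (simulated): run over a grid of configurations.
grid = np.arange(5, 45, 5)
prof_m, prof_r = [a.ravel() for a in np.meshgrid(grid, grid)]
prof_ticks = synthetic_ticks(prof_m, prof_r)

# Modelling phase: fit, then predict an unseen configuration.
model = fit_cpu_model(prof_m, prof_r, prof_ticks)
unseen = predict_cpu_ticks(model, [12], [17])[0]
error = relative_error_percent(unseen, synthetic_ticks(12, 17))

# Configuration tuning: pick the profiled-range setting with the lowest
# predicted CPU ticks.
cand_m, cand_r = [a.ravel() for a in np.meshgrid(np.arange(5, 41), np.arange(5, 41))]
best = np.argmin(predict_cpu_ticks(model, cand_m, cand_r))
best_config = (int(cand_m[best]), int(cand_r[best]))
```

With real profiling data, `prof_ticks` would come from measured runs, and the error on held-out configurations would indicate whether a degree-2 model is adequate for the application.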


CPU utilization, CPU clock tick, MapReduce, Modelling, Prediction, Regression, Configuration parameters, Hadoop


http://anss.org.au/ica3pp12/


nicta:5755


Babaii Rizvandi, Nikzad; Zomaya, Albert; Moraveji, Reza; Taheri, Javid. On Modelling and Prediction of Total CPU Usage for Applications in MapReduce Environments. [Conference Material]. 2012-08-06. http://hdl.handle.net/102.100.100/99852?index=1


