Online failure prediction in UNIX systems
This thesis investigates whether an existing performance monitoring system for UNIX servers can be enhanced with the capability to predict upcoming failures, using generic UNIX operating system performance metrics such as used server memory, CPU utilization, and I/O traffic as input data for machine learning and pattern recognition. We survey possible research methods according to the input data they process, and propose a novel approach for symptom-based failure prediction. To make a generic solution that can be used on any UNIX computer, we have used only open source software. We evaluate the classifiers Naive Bayes and Logistic Regression with input data in both standard and vectorized format. Furthermore, we use the search algorithm forward stepwise selection to find an optimal generic set of variables (features) that improves the quality of the classification. Our empirical testing demonstrates that the proposed method is capable of predicting symptoms with high overall accuracy, but the uncertain quality of the monitored performance data used as input makes it difficult to ascertain whether the symptoms are actually failures. By applying the search algorithm for feature selection and vectorizing the input data set, we reduced the classification time by an order of magnitude. In our opinion, the proposed technique for online failure prediction will benefit performance monitoring applications and contribute new insight to the research field of online failure prediction.
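As an illustration of the feature-selection step named in the abstract, the following is a minimal sketch of forward stepwise selection wrapped around a Gaussian Naive Bayes classifier. It is not the thesis's implementation: the data is synthetic, the three metrics (memory, CPU, I/O), the helper names, and the `min_gain` stopping threshold are all assumptions made for the example, and the classifier is hand-written so the sketch needs only the Python standard library.

```python
import math
import random

def fit_gnb(X, y, cols):
    """Fit Gaussian Naive Bayes on the chosen columns: log prior, means, variances per class."""
    model = {}
    for label in set(y):
        rows = [[x[c] for c in cols] for x, t in zip(X, y) if t == label]
        n = len(rows)
        means = [sum(col) / n for col in zip(*rows)]
        # A small constant keeps the variance strictly positive.
        varis = [sum((v - m) ** 2 for v in col) / n + 1e-9
                 for col, m in zip(zip(*rows), means)]
        model[label] = (math.log(n / len(y)), means, varis)
    return model

def predict_gnb(model, x, cols):
    """Return the class with the highest log posterior for one sample."""
    best, best_score = None, -math.inf
    for label, (log_prior, means, varis) in model.items():
        score = log_prior
        for c, m, v in zip(cols, means, varis):
            score += -0.5 * math.log(2 * math.pi * v) - (x[c] - m) ** 2 / (2 * v)
        if score > best_score:
            best, best_score = label, score
    return best

def accuracy(train, valid, cols):
    """Train on `train`, report the fraction of correct predictions on `valid`."""
    model = fit_gnb(train[0], train[1], cols)
    hits = sum(predict_gnb(model, x, cols) == t for x, t in zip(*valid))
    return hits / len(valid[1])

def forward_stepwise(train, valid, n_features, min_gain=1e-3):
    """Greedily add the feature that raises validation accuracy the most.

    Stops when no remaining feature improves accuracy by at least `min_gain`
    (a hypothetical stopping rule chosen for this sketch).
    """
    selected, best_acc = [], 0.0
    remaining = list(range(n_features))
    while remaining:
        scored = [(accuracy(train, valid, selected + [f]), f) for f in remaining]
        acc, feat = max(scored, key=lambda s: (s[0], -s[1]))
        if acc - best_acc < min_gain:
            break
        selected.append(feat)
        remaining.remove(feat)
        best_acc = acc
    return selected, best_acc

def make_data(n, rng):
    """Synthetic samples: [memory used, CPU utilization, unrelated I/O noise]."""
    X, y = [], []
    for _ in range(n):
        symptom = rng.random() < 0.3
        mem = rng.gauss(0.9 if symptom else 0.5, 0.05)   # strongly informative
        cpu = rng.gauss(0.7 if symptom else 0.4, 0.15)   # weakly informative
        io_noise = rng.gauss(0.5, 0.2)                   # uninformative
        X.append([mem, cpu, io_noise])
        y.append(int(symptom))
    return X, y

rng = random.Random(0)
train, valid = make_data(400, rng), make_data(200, rng)
selected, acc = forward_stepwise(train, valid, n_features=3)
print("selected feature indices:", selected, "validation accuracy:", acc)
```

Because each round retrains the classifier once per remaining feature, the search is quadratic in the number of features; the payoff is a smaller feature set, which is one plausible reason a reduced set would shorten classification time.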
Master's thesis in Information and Communication Technology, IKT590, 2011 – Universitetet i Agder, Grimstad