Using machine learning for exploratory data analysis and predictive modeling
- Master's theses (TN-IDE) 
Exploratory data analysis and predictive analytics can be used to extract hidden patterns from data and are becoming increasingly important tools for transforming data into information. Machine learning has become a powerful technique for predictive analytics: it can predict the dependent variable directly, without focusing on the complex underlying relationships between predictors. The oil and gas industry has found these techniques very useful in applications such as oil well production prediction and equipment failure forecasting. Our work aims to build a data-driven predictive model that produces precise predictions and is efficient in practice. In this work we follow a methodology for building predictive models from real data. The experiments focus on three machine learning algorithms: linear regression, neural networks and k-nearest neighbors. Within each category, experiments are carried out on multiple model variants in order to achieve better performance. The resulting models have been tested on new data, and cross-validation has been performed to validate them. The predictive performance of each model is evaluated using R-squared and root-mean-squared error (RMSE), together with a comparison of predicted and actual values. The experimental results show that nearest neighbors with a k-dimensional tree is the most efficient model, with the best predictive performance in this case. This model is a possible solution for helping experts make predictions based on the data.
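As a rough illustration of the evaluation described above, the following sketch scores a k-nearest-neighbors regressor with cross-validation, R-squared and RMSE. It is not the thesis code: the function names, the synthetic data and the choice of k = 3 with 5 folds are all assumptions made for the example, and the neighbor search is brute force rather than a k-dimensional tree (a tree would return the same neighbors, only faster on large data sets).

```python
import math
import random


def rmse(actual, predicted):
    """Root-mean-squared error between two equal-length sequences."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


def knn_predict(train_x, train_y, query, k):
    """Average the targets of the k training points nearest to `query`."""
    # Brute-force search (assumption for brevity, not a k-d tree).
    order = sorted(range(len(train_x)),
                   key=lambda i: math.dist(train_x[i], query))
    return sum(train_y[i] for i in order[:k]) / k


def cross_validate(xs, ys, k_neighbors=3, folds=5):
    """k-fold cross-validation; returns (RMSE, R-squared) on held-out points."""
    actual, predicted = [], []
    for fold in range(folds):
        # Every `folds`-th sample is held out; the rest form the training set.
        train = [i for i in range(len(xs)) if i % folds != fold]
        test = [i for i in range(len(xs)) if i % folds == fold]
        train_x = [xs[i] for i in train]
        train_y = [ys[i] for i in train]
        for i in test:
            actual.append(ys[i])
            predicted.append(knn_predict(train_x, train_y, xs[i], k_neighbors))
    return rmse(actual, predicted), r_squared(actual, predicted)


# Synthetic data (hypothetical, for illustration only): two predictors,
# one noisy target.
random.seed(0)
xs = [(random.uniform(0, 10), random.uniform(0, 10)) for _ in range(200)]
ys = [2 * x1 + math.sin(x2) + random.gauss(0, 0.1) for x1, x2 in xs]

score_rmse, score_r2 = cross_validate(xs, ys)
print(f"RMSE = {score_rmse:.3f}, R-squared = {score_r2:.3f}")
```

A lower RMSE and an R-squared closer to 1 indicate better predictions. The same two functions can score any of the model families mentioned above, which is what makes the comparison across them possible.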
Master's thesis in Computer science