## Data-Driven Analysis of Vessel Performance

##### Master thesis

##### Permanent link

http://hdl.handle.net/11250/2452266

##### Publication date

2017

##### Collections

- Institutt for marin teknikk [1579]

##### Abstract

In this thesis, the relation between vessel performance and various vessel and environmental variables was investigated using a data-driven approach. A total of 12 variables, such as speed over ground and days since drydock, were considered, with data covering almost three years. The performance loss of the vessel was calculated by measuring the vessel performance and comparing it to an expected performance calculated using computational fluid dynamics. The relation between performance loss and time was investigated in particular, to assess the hull and propeller performance of the vessel. Statistical models were trained to predict the performance loss from the 12 variables, and the models were analyzed to assess the relative importance of the different variables.
The relevant data were extracted and converted to a suitable format. The data were then preprocessed through synchronization, variable redefinition, outlier removal, mean centering and normalization. The prepared dataset was analyzed using principal component analysis to reveal structures in the unlabeled dataset and to verify known relations.
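The mean centering, normalization and principal component analysis steps can be illustrated with a minimal NumPy sketch. The data below are synthetic stand-ins, not the thesis dataset, and the helper names `standardize` and `pca` are hypothetical; the thesis does not state which software or exact scaling was used, so scaling to unit standard deviation is an assumption here.

```python
import numpy as np

def standardize(X):
    """Mean-center each column and scale it to unit sample standard deviation."""
    mean = X.mean(axis=0)
    std = X.std(axis=0, ddof=1)
    std[std == 0] = 1.0  # guard against constant columns
    return (X - mean) / std

def pca(X, n_components):
    """PCA via SVD of an already standardized data matrix X (rows = samples)."""
    U, S, Vt = np.linalg.svd(X, full_matrices=False)
    explained = S**2 / np.sum(S**2)            # fraction of variance per component
    scores = U[:, :n_components] * S[:n_components]
    return scores, Vt[:n_components], explained[:n_components]

rng = np.random.default_rng(0)
# Synthetic example (assumption): two strongly related variables plus one
# independent variable, standing in for the 12 variables of the thesis.
latent = rng.normal(size=(500, 1))
X_raw = np.hstack([
    3.0 * latent + rng.normal(scale=0.1, size=(500, 1)),
    -2.0 * latent + rng.normal(scale=0.1, size=(500, 1)),
    rng.normal(size=(500, 1)),
])

X = standardize(X_raw)
scores, components, explained = pca(X, n_components=2)
print("explained variance ratio:", explained)
```

In this toy case the first component captures the two related variables, which is the kind of known relation a principal component analysis can be used to verify.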
Performance loss was simulated for three different cases. Several statistical learning methods, as well as outlier removal, were applied to the simulated models. This was done to verify that the methodology would reveal the relationships in the simulated data, and so that the simulated models could be compared with the real-world data. Both the linear and non-linear regression models were able to uncover the relationships in the simulated data, and reduced the prediction error rate by as much as 86.8% for the most complex simulated model.
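The idea of validating a method on simulated data can be sketched as follows: generate a performance loss from a known relation, fit a least-squares model, and check that it recovers the relation and lowers the prediction error. Everything in this sketch is an assumption made for illustration: the simulated relation, the variable names, the use of RMSE, and the choice of the training-set mean as the baseline predictor. The abstract does not specify how the error reduction was measured.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 1000

# Hypothetical simulated case: loss depends linearly on speed and on time.
speed = rng.uniform(8.0, 16.0, size=n)     # speed over ground, knots (assumed range)
days = rng.uniform(0.0, 1000.0, size=n)    # days since drydock (assumed range)
loss = 0.5 * speed + 0.01 * days + rng.normal(scale=0.5, size=n)

# Simple hold-out split: first 700 samples for training, the rest for testing.
train = np.arange(n) < 700
test = ~train

def design(speed, days):
    """Design matrix with an intercept column."""
    return np.column_stack([np.ones_like(speed), speed, days])

def rmse(y, y_hat):
    return float(np.sqrt(np.mean((y - y_hat) ** 2)))

# Ordinary least-squares fit on the training data.
coef, *_ = np.linalg.lstsq(design(speed[train], days[train]), loss[train], rcond=None)
pred = design(speed[test], days[test]) @ coef

# Baseline (assumption): always predict the training-set mean.
baseline_rmse = rmse(loss[test], np.full(test.sum(), loss[train].mean()))
model_rmse = rmse(loss[test], pred)
reduction_pct = 100.0 * (1.0 - model_rmse / baseline_rmse)

print("estimated coefficients:", coef)
print(f"error reduction vs. baseline: {reduction_pct:.1f}%")
```

Because the generating coefficients are known, the fitted values can be compared against them directly, which is the check that real-world data cannot provide.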
The same methodology used on the simulated models was applied to the real-world data. A second-degree polynomial regression model reduced the prediction error rate by 97.8%, better than expected. The non-linear nearest-neighbor regression reduced the prediction error rate by only 66.2%. The variables that were most important in the least-squares regression model were those related to the propulsion system of the vessel. When finding the best subset of variables, the propulsion variables were always present. The time variables were not able to reduce the prediction error rate significantly, and it was impossible to draw any strong conclusions on the effect of time on the performance of the vessel. Thus, no prognosis model that can be utilized in maintenance could be made.
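Best-subset selection, mentioned above, can be sketched as an exhaustive search over variable subsets, keeping the subset with the lowest validation error for each size. The variable names, the synthetic data and the hold-out validation scheme below are all hypothetical; the propulsion-like variables dominate here by construction, so the output illustrates the procedure only and says nothing about the thesis results.

```python
import itertools
import numpy as np

def fit_rmse(X_train, y_train, X_val, y_val):
    """Fit least squares with an intercept and return the validation RMSE."""
    A_train = np.column_stack([np.ones(len(X_train)), X_train])
    A_val = np.column_stack([np.ones(len(X_val)), X_val])
    coef, *_ = np.linalg.lstsq(A_train, y_train, rcond=None)
    return float(np.sqrt(np.mean((y_val - A_val @ coef) ** 2)))

def best_subsets(X_train, y_train, X_val, y_val, names):
    """Return {subset size: (validation RMSE, variable names)} by exhaustive search."""
    best = {}
    p = X_train.shape[1]
    for k in range(1, p + 1):
        for cols in itertools.combinations(range(p), k):
            idx = list(cols)
            err = fit_rmse(X_train[:, idx], y_train, X_val[:, idx], y_val)
            if k not in best or err < best[k][0]:
                best[k] = (err, tuple(names[c] for c in cols))
    return best

rng = np.random.default_rng(7)
# Hypothetical variable names; the thesis uses 12 variables not listed here.
names = ["shaft_power", "propeller_rpm", "wind_speed", "days_since_drydock"]
X = rng.normal(size=(600, 4))
# Synthetic response (assumption): strong propulsion effects, weak time effect.
y = 2.0 * X[:, 0] + 1.0 * X[:, 1] + 0.05 * X[:, 3] + rng.normal(scale=0.3, size=600)

X_train, X_val = X[:400], X[400:]
y_train, y_val = y[:400], y[400:]
result = best_subsets(X_train, y_train, X_val, y_val, names)

for k, (err, subset) in result.items():
    print(k, round(err, 3), subset)
```

The search visits 2^p - 1 subsets, which remains feasible for 12 variables (4095 fits) but grows quickly beyond that.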