## The use of logistic regression and quantile regression in medical statistics

##### Master thesis

##### Permanent link

http://hdl.handle.net/11250/2451786

##### Publication date

2017

##### Abstract

The main goal of this thesis is to compare and illustrate the use of logistic regression and quantile regression on continuous outcome variables. In medical statistics, logistic regression is frequently applied to continuous outcomes by dichotomizing them at a cut-off value, whereas quantile regression can be applied directly to quantiles of the outcome distribution. The two approaches appear different but are closely related, and an approximate relation between the quantile effect and the log-odds ratio is derived. Practical examples and illustrations are given through a case study of the effect of maternal smoking during pregnancy and mother's age on birth weight, where low birth weight is of special interest. Both maternal smoking during pregnancy and mother's age are found to have a significant effect on birth weight, and the negative effect of maternal smoking is found to be slightly larger for low birth weights than at other quantiles of the distribution. The trend in birth weight over the years is also studied as part of the case study. Further, the two approaches are tested on data simulated from known probability density functions (pdfs). We consider a population consisting of two groups, one of which is exposed to a factor, and the effect of exposure is of interest. In this way we illustrate the quantile effect and the odds ratio for several examples of location, scale and location-scale shifts of the normal distribution and the Student t-distribution.

In this thesis we find that quantile regression often yields an easier interpretation of the estimated effects, because the estimated parameters are on the same measurement scale as the dependent variable of interest. In addition, quantile regression makes it easier to compare effects in different quantiles of the distribution, whereas the logistic regression model may easily lead to misinterpretations.
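
The contrast between the two effect measures can be sketched for the two-group normal setting described above. The snippet below is only an illustration under stated assumptions: the group parameters are hypothetical birth-weight-like values in grams, not estimates from the thesis, and the function names `compare_effects` and `first_order_log_odds_ratio` are made up for this example. The first-order formula is a plain Taylor expansion for a pure location shift and is not necessarily the approximate relation derived in the thesis.

```python
from math import log
from statistics import NormalDist


def logit(p):
    """Log-odds of a probability p in (0, 1)."""
    return log(p / (1.0 - p))


def compare_effects(unexposed, exposed, tau):
    """Return (quantile effect, log-odds ratio) at quantile level tau.

    The cut-off used for the logistic view is the tau-quantile of the
    unexposed group, so P(Y <= cutoff | unexposed) equals tau.
    """
    cutoff = unexposed.inv_cdf(tau)
    # Quantile effect: difference between the groups' tau-quantiles,
    # expressed on the scale of the outcome itself.
    quantile_effect = exposed.inv_cdf(tau) - cutoff
    # Log-odds ratio of falling at or below the cut-off (exposed vs unexposed).
    log_odds_ratio = logit(exposed.cdf(cutoff)) - logit(tau)
    return quantile_effect, log_odds_ratio


def first_order_log_odds_ratio(unexposed, quantile_effect, tau):
    """First-order Taylor sketch for a location shift (an assumption here):
    log OR is roughly -effect * f0(cutoff) / (tau * (1 - tau))."""
    cutoff = unexposed.inv_cdf(tau)
    return -quantile_effect * unexposed.pdf(cutoff) / (tau * (1.0 - tau))


# Hypothetical parameters, chosen only for illustration.
unexposed = NormalDist(mu=3500, sigma=500)
location_shift = NormalDist(mu=3300, sigma=500)
location_scale_shift = NormalDist(mu=3300, sigma=600)

for label, exposed in [("location", location_shift),
                       ("location-scale", location_scale_shift)]:
    for tau in (0.05, 0.5, 0.95):
        effect, log_or = compare_effects(unexposed, exposed, tau)
        approx = first_order_log_odds_ratio(unexposed, effect, tau)
        print(f"{label:15s} tau={tau:.2f} "
              f"quantile effect={effect:8.1f} g "
              f"log OR={log_or:6.3f} first-order={approx:6.3f}")
```

Under the pure location shift the quantile effect is the same at every quantile level, while the log-odds ratio changes with the chosen cut-off. Under the location-scale shift the quantile effect itself varies across quantiles, which can be read off directly in grams.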