Gradient descent algorithm incorporating stochastic point location schemes and its application in multidimensional scaling analysis
Gradient descent (GD) is a popular approach for solving optimisation problems. A disadvantage of the method is the difficulty of choosing a proper learning rate, which sets the step size for the algorithm. If the learning rate is too small, convergence is unnecessarily slow, whereas if the learning rate is too large, the process will overshoot or even diverge. Solving optimisation problems that are stochastic in nature introduces additional challenges to the GD algorithm, hindering the convergence process. According to the literature, the use of Stochastic Point Location (SPL) schemes can potentially improve any optimisation algorithm by learning the best parameter (the learning rate, in the case of GD) during optimisation.

In this thesis we have enhanced the classical gradient descent algorithm by incorporating Linear SPL and Hierarchical SPL schemes. As a result, two new GD-based algorithms, GD-LSPL and GD-HSPL, have been developed. The new algorithms iteratively determine the learning rate of GD in stochastic environments. To evaluate whether the GD-LSPL and GD-HSPL algorithms improve the performance of GD-based optimisation, we have compared them with GD with a constant learning rate (GD-Classic) and GD with line search (GD-LS). Three different environments have been used for the empirical evaluation: minimising mathematical functions, artificial Multidimensional Scaling (MDS) problems, and a practical MDS problem concerning the classification of Word-of-Mouth (WoM) discussions. MDS is a statistical technique used in information visualisation for exploring similarities or dissimilarities in data. We have proposed and empirically evaluated three different schemes for applying the GD-LSPL and GD-HSPL algorithms when identifying the learning rate in GD-based MDS.
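The idea of learning the learning rate during optimisation can be illustrated with a minimal sketch. This is not the thesis's GD-LSPL or GD-HSPL algorithm: the SPL machinery (a discretised search line traversed by a learning automaton) is simplified here to multiplicative up/down moves on the rate, and all names, defaults, and the noisy test function are the author's illustrative assumptions.

```python
import random

def noisy_grad(x, sigma=0.05):
    # Gradient of f(x) = x**2 with additive noise, standing in for a
    # problem whose randomness lives in the problem data.
    return 2.0 * x + random.gauss(0.0, sigma)

def gd_with_spl_rate(x0, steps=500, lr=0.1, grow=1.02, shrink=0.5,
                     lr_min=1e-6, lr_max=1.0):
    """Gradient descent whose learning rate is adapted by an SPL-style
    rule: a noisy 'oracle' (did the loss decrease?) nudges the rate
    right (larger) or left (smaller) within [lr_min, lr_max]."""
    x = x0
    prev_loss = x * x
    for _ in range(steps):
        x_new = x - lr * noisy_grad(x)
        new_loss = x_new * x_new
        if new_loss < prev_loss:
            lr = min(lr * grow, lr_max)    # step helped: move rate right
        else:
            lr = max(lr * shrink, lr_min)  # step hurt: move rate left
        x, prev_loss = x_new, new_loss
    return x, lr
```

The asymmetric factors (gentle growth, aggressive shrink) mimic how an SPL learner backs off quickly once the rate overshoots the region where steps keep helping.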
In brief, the schemes consist of identifying either a global learning rate that is applied to all the points, a unique learning rate for each individual point, or a unique learning rate for every pair of points. These proposed schemes have been rigorously evaluated on artificial MDS problems and on classifying WoM discussions. The observed results are conclusive: the GD-LSPL and GD-HSPL algorithms introduce a clear improvement over GD-Classic. Since GD-LSPL and GD-HSPL regulate the learning rate during execution, they achieve faster and more accurate convergence without any manual adjustment of parameters.

When minimising mathematical functions, we have introduced randomness into the problem data. In these environments GD-HSPL is the fastest and most accurately converging algorithm, the one least influenced by variation in the starting value, and the one most frequently converging to the global minimum. In solving artificial MDS problems, the most accurately converging algorithms are GD-LS and GD-HSPL with the learning rate applied point-wise. These algorithms also perform quite well in classifying WoM discussions. However, in the latter environment, when the number of points increases, GD-LSPL with the learning rate applied pair-wise has shown the fastest performance.

In solving MDS problems, the randomness is located in the algorithm itself. GD-LS handles randomness in the algorithm quite well, and the improvement achieved by GD-HSPL and GD-LSPL is statistically insignificant. However, under certain conditions the GD-HSPL and GD-LSPL algorithms have shown potential for improving the performance of GD-LS in solving MDS problems. Moreover, they have shown a great improvement in minimising mathematical functions. We therefore believe that the GD-HSPL and GD-LSPL algorithms can be utilised in various GD-based applications, particularly when the randomness of the task is located in the problem data.
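The point-wise scheme (one learning rate per point) can be sketched on raw-stress MDS. The sketch below is the author's illustrative reconstruction, not the thesis's implementation: it minimises raw stress by per-point gradient moves, adapting each point's rate with the same simple grow/shrink rule; function names and all numeric defaults are hypothetical.

```python
import math
import random

def stress(X, D):
    # Raw stress: squared mismatch between configuration distances
    # and the target dissimilarities D[i][j].
    s = 0.0
    n = len(X)
    for i in range(n):
        for j in range(i + 1, n):
            s += (math.dist(X[i], X[j]) - D[i][j]) ** 2
    return s

def mds_pointwise(D, dim=2, steps=300, lr0=0.05, grow=1.02, shrink=0.5, seed=0):
    rng = random.Random(seed)
    n = len(D)
    X = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(n)]
    lrs = [lr0] * n                        # one learning rate per point
    prev = stress(X, D)
    for _ in range(steps):
        for i in range(n):
            # Gradient of the raw stress with respect to point i.
            g = [0.0] * dim
            for j in range(n):
                if j == i:
                    continue
                d = math.dist(X[i], X[j]) or 1e-12
                c = 2.0 * (d - D[i][j]) / d
                for k in range(dim):
                    g[k] += c * (X[i][k] - X[j][k])
            old = X[i]
            X[i] = [old[k] - lrs[i] * g[k] for k in range(dim)]
            new = stress(X, D)
            if new < prev:                 # move helped: grow this point's rate
                lrs[i] = min(lrs[i] * grow, 1.0)
                prev = new
            else:                          # move hurt: revert, shrink the rate
                X[i] = old
                lrs[i] = max(lrs[i] * shrink, 1e-9)
    return X, prev
```

The global and pair-wise schemes differ only in bookkeeping: a single shared rate, or a rate indexed by the pair (i, j) and applied to each pair's gradient contribution.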
Master's thesis in information and communication technology 2010 – University of Agder, Grimstad