Image segmentation and multivariate analysis in two-dimensional gel electrophoresis
- Department of Chemistry
The topic of this thesis is data analysis of images from two-dimensional electrophoretic gels. Because of the complexity of these images, such an analysis involves numerous steps and approaches, and no “gold standard” has yet been established for producing the desired output. This thesis focuses on two essential fields of 2D-gel analysis: registration of images by segmentation and protein spot identification, and analysis of the output of such a registration by multivariate methods. Image segmentation is mainly concerned with the task of identifying individual protein spots in a gel image. This has generally been the natural starting point of the methods and procedures developed since the introduction of 2D gels in the mid-seventies, simply because it best reproduces the results created by a human analyst, who manually identifies protein-spot entities. The amount of data produced in a 2D-gel experiment can be quite large, especially with multiple gels, where the human analyst depends on additional statistical data-analysis tools to produce results. Because of the correlated nature of most gel data, analysis by multivariate methods is a natural choice and is therefore adopted in this thesis. The goal of this thesis is to introduce the above-mentioned procedures at stages of the analysis pipeline where they are not yet fully exploited, rather than to improve already existing algorithms. In this way, new insight and ideas on how to handle data from 2D-gel experiments are achieved. The thesis starts with a review of segmentation methodology and introduces a selected procedure that is used throughout to identify protein spots. Output from the segmentation is then used to create a multivariate spot-filtering model, which aims to separate protein spots from the noise and artefacts that often create problems in 2D-gel analysis. Lately, the use of common spot boundaries across multiple gels has been the method of choice when gels are analysed.
How such boundaries should be defined is an important subject of discussion, and thus a new method for defining common boundaries based on the individual segmentation of each gel is introduced. Segmentation may be a natural starting point when gels are analysed, but it is not necessarily the most correct one. Often the introduction of fixed spot entities imposes restrictions on the data that cause problems at later stages of the analysis. Analysing pixels from multiple gels directly has no such restrictions, and this thesis shows that such an analysis based on multivariate methods can produce very useful results. It can also give insight into the data that is difficult to obtain with the spot-boundary approach. Finally, an improved pixel-based approach is introduced, in which a less restricted segmentation is used to reduce and concentrate the amount of data analysed, improving the final output.
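The idea of analysing pixels from multiple gels directly can be illustrated with a minimal sketch. It assumes the gel images are already aligned and of equal size, and it uses principal component analysis as the multivariate method. The abstract does not name a specific method, so that choice, the function names `pixel_pca` and `make_synthetic_gels`, and the synthetic data are all illustrative assumptions rather than the procedure used in the thesis.

```python
import numpy as np


def pixel_pca(gels, n_components=2):
    """PCA on aligned gel images, treating every pixel as one variable.

    gels: array of shape (n_gels, height, width). Alignment is assumed.
    Returns per-gel scores, loading images and explained-variance ratios.
    """
    gels = np.asarray(gels, dtype=float)
    n_gels, height, width = gels.shape

    # Unfold each image into one row: rows are gels, columns are pixels.
    X = gels.reshape(n_gels, height * width)

    # Centre each pixel across gels so the components describe variation
    # between gels rather than the average gel image.
    Xc = X - X.mean(axis=0)

    # Singular value decomposition of the centred matrix gives the PCA.
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)

    scores = U[:, :n_components] * s[:n_components]
    # Fold each loading vector back into image shape for inspection.
    loadings = Vt[:n_components].reshape(n_components, height, width)
    explained = s[:n_components] ** 2 / np.sum(s ** 2)
    return scores, loadings, explained


def make_synthetic_gels(seed=0):
    """Hypothetical test data: one Gaussian 'spot' whose intensity
    differs between two groups of three gels, plus weak noise."""
    rng = np.random.default_rng(seed)
    yy, xx = np.mgrid[0:20, 0:20]
    spot = np.exp(-((yy - 10) ** 2 + (xx - 10) ** 2) / (2 * 2.0 ** 2))
    amplitudes = [1.0, 1.0, 1.0, 2.0, 2.0, 2.0]
    return np.stack(
        [a * spot + 0.01 * rng.standard_normal(spot.shape) for a in amplitudes]
    )


gels = make_synthetic_gels()
scores, loadings, explained = pixel_pca(gels)
```

In this toy setting the first loading image highlights the pixels whose intensity differs between the two groups, without any spot boundary having been defined beforehand. Real gels would additionally need background correction and alignment before such an analysis.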