Feature selection in Medline using text and data mining techniques
In this thesis we propose a new method for searching for gene products and producing annotations that associate genes with Gene Ontology (GO) codes. Many solutions already exist, using different techniques, but few are capable of addressing the whole GO hierarchy. We propose a method for exploring this hierarchy by dividing it into subtrees and finding terms that are characteristic of each subtree. The correct GO nodes are then identified using feature selection based on chi-square analysis together with naive Bayes classification.
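As a rough illustration of the general technique named in the abstract, the following minimal sketch scores terms with a chi-square statistic, keeps the highest-scoring ones, and classifies a text with multinomial naive Bayes. It is not the thesis implementation: the toy documents, the two subtree labels ("binding" and "transport"), the function names, the Laplace smoothing, and the choice of k are all assumptions made only for this example.

```python
import math
from collections import Counter, defaultdict


def chi_square_scores(docs, labels, target):
    """Chi-square score of each term for class `target`, from a 2x2
    table of document counts (term present/absent, in/out of class)."""
    n = len(docs)
    n_pos = sum(1 for y in labels if y == target)
    n_neg = n - n_pos
    df_all, df_pos = Counter(), Counter()
    for tokens, y in zip(docs, labels):
        for t in set(tokens):
            df_all[t] += 1
            if y == target:
                df_pos[t] += 1
    scores = {}
    for t, df in df_all.items():
        a = df_pos[t]   # in class, contains term
        b = df - a      # out of class, contains term
        c = n_pos - a   # in class, lacks term
        d = n_neg - b   # out of class, lacks term
        denom = (a + b) * (c + d) * (a + c) * (b + d)
        scores[t] = n * (a * d - b * c) ** 2 / denom if denom else 0.0
    return scores


def select_features(docs, labels, k):
    """Keep the k terms with the highest chi-square score for any class."""
    best = defaultdict(float)
    for target in set(labels):
        for t, s in chi_square_scores(docs, labels, target).items():
            best[t] = max(best[t], s)
    ranked = sorted(best, key=lambda t: (-best[t], t))
    return set(ranked[:k])


def train_nb(docs, labels, vocab):
    """Collect class priors and per-class term counts over `vocab`."""
    class_counts = Counter(labels)
    term_counts = defaultdict(Counter)
    for tokens, y in zip(docs, labels):
        term_counts[y].update(t for t in tokens if t in vocab)
    return class_counts, term_counts, vocab


def predict_nb(model, tokens):
    """Return the most probable class under multinomial naive Bayes
    with add-one (Laplace) smoothing."""
    class_counts, term_counts, vocab = model
    n = sum(class_counts.values())
    best_label, best_lp = None, -math.inf
    for y in sorted(class_counts):
        total = sum(term_counts[y].values())
        lp = math.log(class_counts[y] / n)
        for t in tokens:
            if t in vocab:
                lp += math.log((term_counts[y][t] + 1) / (total + len(vocab)))
        if lp > best_lp:
            best_label, best_lp = y, lp
    return best_label


# Hypothetical toy corpus: each document is labelled with an
# illustrative subtree name, not with real annotation data.
docs = [
    "protein kinase binds atp".split(),
    "receptor binds ligand protein".split(),
    "membrane channel transports ions protein".split(),
    "transporter moves ions across membrane".split(),
]
labels = ["binding", "binding", "transport", "transport"]

vocab = select_features(docs, labels, k=3)
model = train_nb(docs, labels, vocab)
prediction = predict_nb(model, "enzyme binds substrate".split())
```

In this sketch a term such as "protein", which occurs in both classes, scores lower than terms confined to one class, so it is dropped before classification. A real system would need far more text per subtree and a principled way to choose k.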