Searching large data volumes with MISD processing
Historically, supercomputing has focused on number crunching. Nonnumeric applications, such as information retrieval and analysis, have to a lesser extent been able to exploit the inherent resources of supercomputers. This thesis presents the results from the development of a novel multiple instruction, single data (MISD) architecture, targeting evaluation of complex queries in large data volumes. For such applications, this architecture provides a better price versus performance ratio, better use of the available memory bandwidth, lower power consumption, as well as linear scalability. The core element of this technology is the Pattern Matching Chip (PMC). Each chip provides 1024 processing elements, with an accumulated performance of 10^10 operations per second. Multiple chips can be run in parallel with linear scalability, either within one computer, or in larger clusters. Up to half a million processing elements have been used in parallel in this project, providing 5 · 10^13 operations per second at 48 GB per second data throughput, in a unit smaller than one cubic meter. Even larger systems can be constructed, still with linear scalability. Through the novelty of this hardware architecture, the performance gained has enabled information processing in a way that would have been cost prohibitive with traditional computers. Such processing has demonstrated the capability of finding nuggets of valuable data in large and complex data volumes. The main effort, and thus also the most important practical results, has been in bioinformatics. However, the technology has applicability in numerous other data mining applications.
Consists of
Halaas, A; Svingen, B; Nedland, M; Saetrom, P; Snove, O. Jr; Birkeland, Olaf René. A recursive MISD architecture for pattern matching. IEEE Transactions on Very Large Scale Integration Systems. 12(7): 727-734, 2007.
Birkeland, Olaf René; Snove, O; Halaas, A; Nedland, M; Saetrom, P. The Petacomp Machine - A MIMD Cluster for Parallel Pattern-mining. 2006 IEEE International Conference on Cluster Computing: 1-10, 2006.
Snove, O. Jr; Humberset, Håkon; Birkeland, Olaf René; Sætrom, P. Sequence Explorer: interactive exploration of genomic sequence data.
Snøve, O. Jr; Nedland, M; Fjeldstad, Ståle H; Humberset, Håkon; Birkeland, Olaf René; Grünfeld, Thomas; Sætrom, P. Designing effective siRNAs with off-target control. Biochemical and Biophysical Research Communications. 325(3): 769-773, 2004.
Sætrom, P; Birkeland, Olaf René; Snøve, O. Jr. Boosting Improves Stability and Accuracy of Genetic Programming in Biological Sequence Classification. Genetic Programming Theory and Practice IV: 61-78, 2007.
Birkeland, Olaf René; Snøve, O. Jr. The Pattern Matching Chip. Interagon AS, 2002.