Engineering Efficient Similarity Search Methods
MetadataShow full item record
Similarity search has become one of the important parts of many applications including multimedia retrieval, pattern recognition and machine learning. In this problem, the main task is to retrieve most similar (closest) objects in a large data set to a query efficiently by using a function that gives the distance between two objects. This thesis is presented as a collection of seven papers with a tutorial on the topic and is mainly focused on efficiency issues of similarity search. • Paper I provides a survey on state-of-the-art approximate methods for metric and non-metric spaces. • Paper II proposes an approximate indexing method for the non-metric Bregman divergences. • Paper III shows how to transform static indexing methods into dynamic ones. • Paper IV presents a learning method to prune in metric and non-metric spaces. • Paper V improves the existing approximate technique that can be applied to ball-based metric indexing methods. • Paper VI presents an open source cross-platform similarity search library and a toolkit called the “Non-Metric Space Library” (NMSLIB) for evaluation of similarity search methods. • Paper VII evaluates a metric index for approximate string matching.
Has partsPaper 1: Naidan, Bilegsaikhan; Boytsov, Leonid; Nyberg, Eric. Permutation Search Methods are Efficient, Yet Faster Search is Possible.. Proceedings of the VLDB Endowment 2015 ;Volum 8.(12) s. 1618-1629
Paper 2: Naidan, Bilegsaikhan; Hetland, Magnus Lie. Bregman hyperplane trees for fast approximate nearest neighbor search. International Journal of Multimedia Data Engineering and Management (IJMDEM) 3(4) Is not included due to copyright available at http://dx.doi.org/10.4018/jmdem.2012100104
Paper 3: Naidan, Bilegsaikhan; Hetland, Magnus Lie. Static-to-dynamic transformation for metric indexing structures (extended version). Information Systems 2014 ;Volum 45. s. 48-60
Paper 4: Boytsov, Leonid; Naidan, Bilegsaikhan; Learning to Prune in Metric and Non-Metric Spaces - Part of: Advances in Neural Information Processing Systems 26 (NIPS 2013)
Paper 5: Naidan, Bilegsaikhan; Hetland, Magnus Lie. Shrinking data balls in metric indexes - The 5th International Conference on Advances in Databases, Knowledge, and Data Applications (DBKDA), 2013.
Paper 6: Boytsov, Leonid; Naidan, Bilegsaikhan; Engineering Efficient and Effective Non-metric Space Library - International Conference on Similarity Search and Applications, SISAP 2013: Similarity Search and Applications pp 280-293, Lecture Notes in Computer Science, vol 8199, http://dx.doi.org/10.1007/978-3-642-41062-8_28 The final publication is available at link.springer.com.
Paper 7: Naidan, Bilegsaikhan; Hetland, Magnus Lie. An empirical evaluation of a metric index for approximate string matching. Norsk Informatikkonferanse (NIK), 2012