Similarity Search in High-Dimensional Vector Spaces

No image set
Author/s:
R. Weber
Pages:
239
EAN/ISBN:
978-3-89838-474-2
Volume:
74
Binding:
softcover
Book Series:
Dissertationen zu Datenbanken und Informationssystemen
Kategorien:
Book
Computer Sience
Informationstechnologie und Informationssysteme
Dissertationen zu Datenbanken und Informationssytemen
English
Complete Index AKA Publisher
Price:
38,00 €
inkl. 7% Tax
With the example of an image database and the notion of relevance feedback, the dissertation addresses the important and challenging problem of identifying the most similar objects in a database given a set of reference objects and a set of features. Similarity search is typically implemented as "Nearest Neighbour Search" (NN- Search) in high-dimensional vector spaces. Although a large number of existing work have provided solutions for the NN-Search problem, most of them have not taken into account the specific problems of high dimensionality explicitly. This work carefully investigates the so-called "Curse of Dimensionality", and presents a novel, flat organization for NN-Search optimized for high-dimensional spaces: the so-called "Vector Approximation File" (VA-File). The superiority of the VA- File over conventional indexing structures is shown theoretically and with extensive experiments. Further refinements of the VA-File discussed in the current work include approximate search and parallel search in a cluster of workstations. From a practical perspective, the dissertation provides an indexing technique that allows for interactive-time similarity search even in huge databases with gigabytes of vector data. As such, it is the most favourable indexing technique for future multimedia retrieval systems.