MultiIndex
How to search in a multi-index

Algorithm

Given a query point, the search algorithm traverses the inverted multi-index entries in the order of increasing distance from the entry centroid to the query as descibed in the paper.
It accumulates the points from the traversed entries and stops when their count reaches the value requested by the user.
If rerank mode is on (flag --do_rerank is set) the code also estimates the distance to query for every traversed point using the extra information stored for reranking. In this case, the traversed points are sorted by the increasing distance estimate.

As for the index construction, you should provide the coarse vocabularies and the fine vocabularies (these should be the same vocabularies as used for the index construction). To measure the accuracy of the system, you should provide a file with a list of query points and the ground truth nearest neighbors.

File formats

Indexing sample

To launch the search for all queries and to estimate the accuracy of the search algorithm run the "searcher_tester" application. The following command-line options control the execution:

--coarse_vocabs_file        - the path to the file with coarse vocabs (see the format description above)
--fine_vocabs_file          - the path to the file with fine vocabs for reranking(see the format description above)
--query_point_type          - "BVEC" or "FVEC"
--use_residuals             - the reranking method flag. Specify it if you are using the residuals for reranking (Multi-D-ADC) and omit it if you are using the initial vector (Multi-ADC)
--space_dim                 - space dimensionality (e.g. 128 for SIFTs)
--subspaces_centroids_count - the number of nearest vocabulary items to consider (L in the paper)
--index_files_prefix        - the common prefix of the multi-index files (to control runs with different parameters)
--queries_file              - the path to the file with queries (should be in .bvecs or .fvecs format)
--groundtruth_file          - the path to the file with groundtruth (should be in .ivecs format)
--queries_count             - the number of queries to search
--neighbours_count          - the number of neighbours involved in reranking
--report_file               - the path to the file to store the search quality report
--do_rerank                 - this flag indicates whether the search algorithm should rerank points based on the estimated distance to the query

Windows users can try test_searcher.bat script. It launches search in the index builded by launch_indexer.bat script. Unix users should just write a similar test_searcher.sh script.

 All Classes Files Functions Variables Typedefs Enumerations Enumerator