Hierarchical Link and Code: Efficient Similarity Search for Billion-Scale Image Sets
Loading...
Date
2021
Journal Title
Journal ISSN
Volume Title
Publisher
The Eurographics Association
Abstract
Similarity search is an indispensable component in many computer vision applications. To index billions of images on a single commodity server, Douze et al. introduced L&C that works on operating points considering 64-128 bytes per vector. While the idea is inspiring, we observe that L&C still suffers the accuracy saturation problem, which it is aimed to solve. To this end, we propose a simple yet effective two-layer graph index structure, together with dual residual encoding, to attain higher accuracy. Particularly, we partition vectors into multiple clusters and build the top-layer graph using the corresponding centroids. For each cluster, a subgraph is created with compact codes of the first-level vector residuals. Such an index structure provides better graph search precision as well as saves quite a few bytes for compression. We employ the second-level residual quantization to re-rank the candidates obtained through graph traversal, which is more efficient than regression-from-neighbors adopted by L&C. Comprehensive experiments show that our proposal obtains over 30% higher recall@1 than the state-of-thearts, and achieves up to 7.7x and 6.1x speedup over L&C on Deep1B and Sift1B, respectively.
Description
@inproceedings{10.2312:pg.20211397,
booktitle = {Pacific Graphics Short Papers, Posters, and Work-in-Progress Papers},
editor = {Lee, Sung-Hee and Zollmann, Stefanie and Okabe, Makoto and Wünsche, Burkhard},
title = {{Hierarchical Link and Code: Efficient Similarity Search for Billion-Scale Image Sets}},
author = {Yang, Kaixiang and Wang, Hongya and Du, Ming and Wang, Zhizheng and Tan, Zongyuan and Xiao, Yingyuan},
year = {2021},
publisher = {The Eurographics Association},
ISBN = {978-3-03868-162-5},
DOI = {10.2312/pg.20211397}
}