Scalable Object-Class Retrieval with
Approximate and Top-k Ranking
Mohammad Rastegari*, Chen Fang*, Lorenzo Torresani
*: equal contributions
Abstract:
In
this paper we address the problem of object-class retrieval in large image data
sets: given a small set of training examples defining a visual category, the
objective is to efficiently retrieve images of the same class from a large
database. We propose two contrasting retrieval schemes achieving good accuracy
and high efficiency. The first exploits sparse classification models expressed
as linear combinations of a small number of features. These sparse models can
be efficiently evaluated using inverted file indexing. Furthermore, we
introduce a novel ranking procedure that provides a significant speedup over
inverted file indexing when the goal is restricted to finding the top-k (i.e., the k highest ranked) images in the data set. We
contrast these sparse retrieval models with a second scheme based on
approximate ranking using vector quantization. Experimental results show that
our algorithms for object-class retrieval can search a database of 10 million images in
just a couple of seconds and produce categorization accuracy comparable to the
best known class-recognition systems.
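To make the first scheme concrete, here is a minimal Python sketch (not the paper's implementation) of how a sparse linear classifier can be evaluated with an inverted file index: only the postings lists of the few features carrying non-zero weight are visited, and a heap extracts the k highest-ranked images at the end. The toy database, feature ids, and weights are illustrative assumptions.

```python
import heapq
from collections import defaultdict

# Illustrative database: each image is a set of active (binary) feature ids.
# In the paper images are binary classeme vectors; this data is made up.
database = {
    0: {3, 17, 42},
    1: {3, 8},
    2: {17, 99},
    3: {8, 42, 99},
}

# Inverted file index: feature id -> list of image ids containing that feature.
inverted_index = defaultdict(list)
for image_id, features in database.items():
    for f in features:
        inverted_index[f].append(image_id)

# A sparse linear classifier: non-zero weights on only a few features.
sparse_weights = {3: 1.5, 42: 0.7, 99: -0.4}

def score_with_inverted_index(weights, index):
    """Accumulate classifier scores by walking only the postings lists
    of the features that carry non-zero weight."""
    scores = defaultdict(float)
    for feature, w in weights.items():
        for image_id in index.get(feature, []):
            scores[image_id] += w
    return scores

def top_k(scores, k):
    """Return the k highest-scoring (score, image_id) pairs."""
    return heapq.nlargest(k, ((s, i) for i, s in scores.items()))

scores = score_with_inverted_index(sparse_weights, inverted_index)
print(top_k(scores, 2))
```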
Paper:
Mohammad Rastegari*, Chen Fang*, Lorenzo Torresani. Scalable Object-Class Retrieval with Approximate and Top-k Ranking. ICCV, 2011. [PDF][bibtex]
Results:
The above figure shows class-retrieval
precision versus search-time performance on the 15,000-class data set: the x-axis is search time; the y-axis shows the percentage of
positives ranked in the top 10. We used 400
randomly selected categories as query classes. For each of these classes, we
capped the number of true positives in the database at 450. The total number
of distractors for each query is 9,671,611. We use the binary classeme
descriptor [1] to compactly represent each image as a vector of 2659 bits. We
trained a classifier for each query category using a training set consisting of
10 positive examples and 15,202 negative examples. The curve for each method is
obtained by varying the parameters controlling the accuracy-speed tradeoff. From
this figure, we can see that our approximate-ranking (AR) and top-k (TkP) methods provide considerable speedups over
inverted file indexing.
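The AR curve corresponds to the approximate-ranking scheme mentioned in the abstract. The sketch below illustrates one plausible way vector quantization can support approximate ranking, assuming k-means quantization: database vectors are assigned to centroids offline, and at query time the classifier scores only the centroids, with each image inheriting its centroid's score. This is an illustrative reading under assumed details, not the exact algorithm from the paper.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)

# Toy stand-in for the database: binary descriptors (the paper uses 2659-bit
# classeme vectors; 64 dimensions here just to keep the example small).
database = rng.integers(0, 2, size=(10_000, 64)).astype(np.float32)

# Offline: quantize the database with k-means; remember each image's centroid.
n_centroids = 128
kmeans = KMeans(n_clusters=n_centroids, n_init=4, random_state=0).fit(database)
centroids = kmeans.cluster_centers_          # (n_centroids, 64)
assignment = kmeans.labels_                  # centroid id for every image

# Query time: a linear classifier (weights w, bias b) scores only the
# centroids; every image inherits the score of its centroid.
w = rng.standard_normal(64).astype(np.float32)
b = 0.0
centroid_scores = centroids @ w + b          # n_centroids dot products
approx_scores = centroid_scores[assignment]  # broadcast to all images

# Approximate ranking: order images by their centroid's score.
ranking = np.argsort(-approx_scores)
print("top-10 image ids:", ranking[:10])
```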
How much accuracy have we sacrificed for the sake of this efficiency? This question is answered in the table above, where we compare the retrieval accuracy of our approaches with the state-of-the-art class-recognition system of Gehler and Nowozin [2], the LP-β classifier. As low-level features for the LP-β classifier we use the same 13 descriptors that were used to learn the classemes. We train the retrieval models on each Caltech256 class separately by choosing 50 positive examples of the query category and 255 negative examples obtained by sampling one image from each of the other categories. Precision at 25 is evaluated on a subset of Caltech256 containing 6,400 images (25 images per category). The table shows that our efficient retrieval models achieve accuracy comparable to that of the much more computationally expensive LP-β classifier.
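Precision at 25 is commonly defined as the fraction of true positives among the 25 highest-ranked images. The small helper below computes it under that assumed definition; the function and toy data are illustrative, not the paper's evaluation code.

```python
def precision_at_k(ranked_image_ids, positive_ids, k=25):
    """Fraction of the k highest-ranked images that are true positives.
    Assumes `ranked_image_ids` is sorted from highest to lowest score."""
    top = ranked_image_ids[:k]
    hits = sum(1 for image_id in top if image_id in positive_ids)
    return hits / k

# Toy usage: among the 25 top-ranked images, 5 happen to be positives.
ranked = list(range(100))
positives = {0, 3, 7, 11, 19}
print(precision_at_k(ranked, positives))   # 0.2
```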
Software:
Code for the Top-k ranking method is available here.
Acknowledgements:
This material is based upon
work supported by the National Science Foundation under CAREER award
IIS-0952943. Any opinions, findings and conclusions or recommendations
expressed in this material are those of the author(s) and do not necessarily
reflect the views of the National Science Foundation (NSF).
References:
[1] Lorenzo Torresani, Martin Szummer and Andrew W. Fitzgibbon, "Efficient Object Category Recognition Using Classemes", ECCV 2010.
[2] Peter V. Gehler and Sebastian Nowozin, "On feature combination for multiclass object classification", ICCV 2009.