Efficient Region Based Indexing and Retrieval for Images with
Discriminative Relevance Feedback With
Representation For Efficient Image Retrieval
Suman Karthik and C.V. Jawahar
Multimedia data is growing exponentially.
Cheap high quality digital imaging devices
Sharing of multimedia data on the internet
Content based organization and retrieval is a viable
way of accessing this data.
What do users look for?
*In CBIR systems users look for ‘things’ not ‘stuff’.
More than global image properties
Traditional object recognition won’t work
Rely on text to identify objects
Look at regions (objects or parts of objects)
The region based image retrieval paradigm was
successfully applied by Carson et al. in
Blobworld[99-2004] and Wang et al. in
*Chad Carson: Blobworld
Practical CBIR systems
Practical large scale deployment of CBIR
Efficient indexing and retrieval of thousands of images.
Flexible framework for retrieval based on various
types of features, for example specialized
features for highly specific subdomains such as
faces, vehicles, and monuments.
Ability to scale up to millions of images without a
significant performance trade-off.
Lessons From Text Retrieval
Large scale text retrieval systems have been
Search Engines like
Efficient indexing and retrieval of millions of
documents has been achieved.
The text retrieval frameworks are adaptive enough
to be applied to specialized domains.
Virtual Textual Representation
Images as text documents.
Color (YUV), compactness, and location of each segment
are used to encode the segment as text.
by strings or words
Quantization can be achieved in a number of ways:
Uniform vector space quantization for data sets with
a uniform feature point distribution.
Density based quantization of the feature space can
be achieved with simple k-means quantization.
Irrespective of the quantization applied, each cell
in the vector space has a representative string.
Each image segment is assigned to a cell and
receives the representative string of that cell.
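The quantization step above can be sketched in a few lines. This is a minimal illustration, not the authors' encoding: the function names, the letter alphabet, the `w<index>` word format, and the assumption that every feature component is already normalized to [0, 1] are all hypothetical choices made for the example.

```python
from typing import Sequence


def uniform_cell_string(feature: Sequence[float], bins: int = 4) -> str:
    """Uniform quantization: one letter per dimension names the cell.

    Assumes each component lies in [0, 1] and bins <= 26 (hypothetical
    conventions, not taken from the slides).
    """
    letters = []
    for value in feature:
        # Clamp so that 1.0 falls into the last bin and negatives into the first
        index = max(0, min(int(value * bins), bins - 1))
        letters.append(chr(ord("a") + index))
    return "".join(letters)


def nearest_centroid_string(feature: Sequence[float],
                            centroids: Sequence[Sequence[float]]) -> str:
    """Density based variant: the cell is the nearest centroid.

    The centroids would come from k-means run beforehand; here they are
    simply passed in.
    """
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    best = min(range(len(centroids)),
               key=lambda i: sq_dist(feature, centroids[i]))
    return f"w{best}"


def image_to_document(segments: Sequence[Sequence[float]], bins: int = 4) -> str:
    """Encode an image as a text document: one word per segment."""
    return " ".join(uniform_cell_string(s, bins) for s in segments)
```

Either function returns the representative string of the cell a segment falls into, so the choice of quantizer stays hidden from the text retrieval layer that consumes the resulting words.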
Discriminative Relevance Feedback
Discriminative regions are
given higher weight than
Image segments that can
differentiate between roses
and other flowers are given
higher weight with respect
to the class roses.
Regions aiding classification
rather than clustering are
Image segments containing
humans are able to
differentiate between ‘surfer’
and ‘wave’ images.
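One way to make the idea of discriminative weighting concrete is sketched below. The slides do not give the weighting formula, so the smoothed log ratio used here is an assumption chosen for illustration, as are the function name and the `smoothing` parameter. Each image is a list of segment words, and a word scores high when it occurs in relevant images but rarely in non-relevant ones.

```python
import math
from collections import Counter


def discriminative_weights(relevant, non_relevant, smoothing=1.0):
    """Weight each word by how well it separates relevant from non-relevant images.

    Hypothetical scoring: log of the smoothed fraction of relevant images
    containing the word over the same fraction for non-relevant images.
    """
    # Document frequency: count each word at most once per image
    rel_df = Counter(w for doc in relevant for w in set(doc))
    non_df = Counter(w for doc in non_relevant for w in set(doc))
    weights = {}
    for word in set(rel_df) | set(non_df):
        p_rel = (rel_df[word] + smoothing) / (len(relevant) + 2 * smoothing)
        p_non = (non_df[word] + smoothing) / (len(non_relevant) + 2 * smoothing)
        weights[word] = math.log(p_rel / p_non)
    return weights


# 'surfer' images marked relevant, plain 'wave' images marked non-relevant
surfer = [["wave", "human"], ["wave", "human", "sky"]]
wave = [["wave", "sky"], ["wave"]]
weights = discriminative_weights(surfer, wave)
```

In this toy feedback round the word for human segments receives a positive weight, while the wave word, present in both classes, receives a weight of zero.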
Intuitive way of learning content
Over-segmentation and subsequent deduction of content
can be achieved if the problem is modeled this way.
Discriminative relevance feedback consistently
outperformed region based
Given are the precision data
for discriminative relevance
feedback and Bayesian
CBIR systems usually use spatial databases to index and retrieve images.
Blobworld uses variants of R-trees. *[Megan Thomas, Chad Carson,
Joseph M. Hellerstein, Creating a Customized Access Method for
Blobworld, ICDE (2000)]
Relevance feedback skews the feature space, rendering spatial
databases inefficient. *[Peng et al., Kernel Indexing for Relevance
Feedback Image Retrieval]
Elastic Bucket Trie
Spatial data structures vs. EBT
Spatial data structures | Elastic Bucket Trie
Become inefficient when used with relevance feedback | Not affected by relevance feedback
Requires costly arithmetic | Requires bit operations
Number of splits of the spatial data structure is not fixed | Number of splits of EBT is limited
Strictly adheres to spatial characteristics of the feature space | The trie need not adhere to an underlying spatial structure, though that is also possible.*
Suman Karthik, C.V. Jawahar, Efficient Region Based Indexing and Retrieval
for Images with Elastic Bucket Tries, ICPR (2006)
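To illustrate the two properties listed in the comparison (bit operations only, and a bounded number of splits), here is a generic bucket trie sketch. It is not the EBT described in the cited paper: the class name `BucketTrie`, the `key_bits` and `capacity` parameters, and the choice to let a bucket grow once the maximum depth is reached are all assumptions made for this example. Keys are integer codes of quantized segments.

```python
class BucketTrie:
    """Binary trie over fixed-width integer keys with buckets at the leaves.

    A leaf is {"bucket": [...]}; an inner node is {"children": [zero, one]}.
    """

    def __init__(self, key_bits=8, capacity=2):
        self.key_bits = key_bits   # depth bound: at most key_bits splits per path
        self.capacity = capacity   # bucket size that triggers a split
        self.root = {"bucket": []}

    def _bit(self, key, depth):
        # Bit examined at this depth, most significant first (shift and mask only)
        return (key >> (self.key_bits - 1 - depth)) & 1

    def _leaf(self, key):
        node, depth = self.root, 0
        while "children" in node:
            node = node["children"][self._bit(key, depth)]
            depth += 1
        return node, depth

    def insert(self, key, item):
        node, depth = self._leaf(key)
        node["bucket"].append((key, item))
        # Split overflowing buckets while unused key bits remain
        while len(node["bucket"]) > self.capacity and depth < self.key_bits:
            entries = node.pop("bucket")
            node["children"] = [{"bucket": []}, {"bucket": []}]
            for k, v in entries:
                node["children"][self._bit(k, depth)]["bucket"].append((k, v))
            # Only the fuller child can still overflow
            node = max(node["children"], key=lambda c: len(c["bucket"]))
            depth += 1

    def lookup(self, key):
        """Return every item stored in the bucket that the key maps to."""
        node, _ = self._leaf(key)
        return [item for _, item in node["bucket"]]
```

A lookup returns the whole bucket, that is, all stored segments sharing the query's bit prefix, which serves as a candidate set for ranking. No distance computation is involved in reaching the bucket.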
Relevance feedback and EBT
Typical relevance feedback algorithms need
to be modified to work with text.
Keywords emerge with relevance feedback
signifying association between key segments.
EBT can be used without any modifications
with discriminative relevance feedback.
Bag of words
The scheme is very similar to contemporary
bag of words approaches.
Interest point based bag of words
approaches can also be adapted to work
within our framework.*
Any type of vector quantization of the
feature space used by these schemes can
*R. Fergus, L. Fei-Fei, P. Perona, and A. Zisserman. Learning object categories from
Google's image search. ICCV, 2005.
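Once images are bags of words, standard text retrieval machinery applies. The sketch below shows a plain inverted index with optional per-word weights. It is a generic illustration under stated assumptions, not the system described in these slides: the function names, the image identifiers, and the scoring rule (sum of weights of shared words) are hypothetical.

```python
from collections import Counter, defaultdict


def build_inverted_index(documents):
    """Map each word to the set of image ids whose bag of words contains it."""
    index = defaultdict(set)
    for image_id, words in documents.items():
        for word in words:
            index[word].add(image_id)
    return index


def search(index, query_words, weights=None):
    """Rank images by the summed weight of the words they share with the query.

    Words missing from `weights` default to a weight of 1.0.
    """
    scores = Counter()
    for word in set(query_words):
        weight = 1.0 if weights is None else weights.get(word, 1.0)
        for image_id in index.get(word, ()):
            scores[image_id] += weight
    # Highest score first; ties broken by image id for a stable order
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [image_id for image_id, _ in ranked]


documents = {
    "img1": ["acd", "aad"],
    "img2": ["acd", "bbb"],
    "img3": ["ccc"],
}
index = build_inverted_index(documents)
results = search(index, ["acd", "aad"])
```

The `weights` argument is where weights learned from relevance feedback could be plugged in, and the words themselves may come from any vector quantization of segment or interest point features.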
Usage of heterogeneous strings to describe
Text encoding of images that is highly
Text encoding of images that is robust to
Text based image mining to discover
concepts and their features.