Content-Based Image Retrieval by Clustering

Yixin Chen
University of New Orleans, New Orleans, LA 70148

James Z. Wang
The Pennsylvania State University, University Park, PA 16802

Robert Krovetz
Teoma Technologies, Piscataway, NJ 08554

Abstract:

In a typical content-based image retrieval (CBIR) system, query results are a set of images sorted by feature similarity to the query. However, images with high feature similarity to the query may be very different from the query in terms of semantics. This is known as the semantic gap. We introduce a novel image retrieval scheme, CLUster-based rEtrieval of images by unsupervised learning (CLUE), which tackles the semantic gap problem based on a hypothesis: semantically similar images tend to be clustered in some feature space. CLUE attempts to capture semantic concepts by learning the way that images of the same semantics are similar and by retrieving image clusters instead of a set of ordered images. Clustering in CLUE is dynamic. In particular, the clusters formed depend on which images are retrieved in response to the query. Therefore, the clusters give the algorithm, as well as the users, semantically relevant clues as to where to navigate. CLUE is a general approach that can be combined with any real-valued symmetric similarity measure (metric or nonmetric). Thus it may be embedded in many current CBIR systems. Experimental results based on a database of about 60,000 images from COREL demonstrate improved performance.
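The retrieve-then-cluster idea described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the function names, the Gaussian similarity, and the parameters `k` and `threshold` are all hypothetical, and the clustering step uses simple thresholded connected components as a stand-in for whatever clustering method CLUE actually applies. It only shows that the clusters are formed dynamically from the images retrieved for a given query, using nothing but a real-valued symmetric similarity function.

```python
import math
from typing import Callable, List, Sequence

Vector = Sequence[float]
Similarity = Callable[[Vector, Vector], float]


def gaussian_similarity(a: Vector, b: Vector) -> float:
    """Hypothetical symmetric similarity: exp(-squared Euclidean distance)."""
    return math.exp(-sum((x - y) ** 2 for x, y in zip(a, b)))


def retrieve_neighbors(query: Vector, database: List[Vector],
                       similarity: Similarity, k: int) -> List[int]:
    """Return the indices of the k database items most similar to the query."""
    ranked = sorted(range(len(database)),
                    key=lambda i: similarity(query, database[i]),
                    reverse=True)
    return ranked[:k]


def cluster_by_threshold(indices: List[int], database: List[Vector],
                         similarity: Similarity,
                         threshold: float) -> List[List[int]]:
    """Stand-in clustering (an assumption, not the paper's method):
    connected components of the graph that links two retrieved items
    whenever their similarity is at least `threshold`."""
    parent = {i: i for i in indices}

    def find(i: int) -> int:
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for pos, i in enumerate(indices):
        for j in indices[pos + 1:]:
            if similarity(database[i], database[j]) >= threshold:
                parent[find(i)] = find(j)

    groups = {}
    for i in indices:
        groups.setdefault(find(i), []).append(i)
    return list(groups.values())


def cluster_based_retrieval(query: Vector, database: List[Vector],
                            similarity: Similarity, k: int,
                            threshold: float) -> List[List[int]]:
    """Retrieve the query's neighbors, cluster them, and return the
    clusters ordered by their best member's similarity to the query."""
    neighbors = retrieve_neighbors(query, database, similarity, k)
    clusters = cluster_by_threshold(neighbors, database, similarity, threshold)
    clusters.sort(key=lambda c: max(similarity(query, database[i]) for i in c),
                  reverse=True)
    return clusters
```

Because the clusters are computed only over the neighbors retrieved for the current query, a different query yields a different neighborhood and therefore different clusters. Any other symmetric similarity function can be passed in place of `gaussian_similarity`.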


Full Paper in Color
(PDF, 0.5MB)

On-line Demo


Citation: Yixin Chen, James Z. Wang and Robert Krovetz, "Content-Based Image Retrieval by Clustering," Proc. 5th International Workshop on Multimedia Information Retrieval, in conjunction with ACM Multimedia, pp. 193-200, Berkeley, CA, ACM, November 2003.

Copyright 2003 ACM. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works, must be obtained from the ACM.

Last Modified: September 7, 2003