Automatic Linguistic Indexing of Pictures
By a Statistical Modeling Approach
Jia Li, James Z. Wang
The Pennsylvania State University, University Park, PA 16802
Abstract:
Automatic linguistic indexing of pictures is an important but highly
challenging problem for researchers in computer vision and
content-based image retrieval. In this paper, we introduce a
statistical modeling approach to this problem. Categorized images are
used to train a dictionary of hundreds of statistical models, each
representing a concept. Images of any given concept are regarded as
instances of a stochastic process that characterizes the concept. To
measure the extent of association between an image and the textual
description of a concept, the likelihood of the occurrence of the
image based on the characterizing stochastic process is computed. A
high likelihood indicates a strong association. In our experimental
implementation, we focus on a particular group of stochastic
processes, that is, the two-dimensional multiresolution hidden Markov
models (2-D MHMMs). We implemented and tested our ALIP (Automatic
Linguistic Indexing of Pictures) system on a photographic image
database of 600 different concepts, each with about 40 training
images. The system is evaluated quantitatively using more than 4,600
images outside the training database and compared with a random
annotation scheme. Experiments demonstrate the good accuracy of the
system and its high potential for the linguistic indexing of
photographic images.
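
The likelihood-based association described in the abstract can be illustrated with a minimal sketch. It is not the ALIP implementation: to stay short, a diagonal Gaussian over a toy feature vector stands in for the 2-D MHMM, and the concept names, feature values, and function names (fit_diag_gaussian, log_likelihood, rank_concepts) are all hypothetical. Only the overall pattern follows the text: train one model per concept, score a new image under every model, and treat a higher likelihood as a stronger association.

```python
import math


def fit_diag_gaussian(vectors):
    """Fit per-dimension mean and variance to one concept's training vectors.

    Assumption: this simple model replaces the 2-D MHMM used in the paper.
    """
    n = len(vectors)
    dim = len(vectors[0])
    means = [sum(v[d] for v in vectors) / n for d in range(dim)]
    # Floor the variance so a constant feature cannot cause division by zero.
    variances = [
        max(sum((v[d] - means[d]) ** 2 for v in vectors) / n, 1e-6)
        for d in range(dim)
    ]
    return means, variances


def log_likelihood(model, x):
    """Log-likelihood of feature vector x under a diagonal Gaussian model."""
    means, variances = model
    return sum(
        -0.5 * (math.log(2 * math.pi * var) + (xi - mu) ** 2 / var)
        for xi, mu, var in zip(x, means, variances)
    )


def rank_concepts(dictionary, x, top_k=3):
    """Return the top_k concept names, ordered by decreasing likelihood of x."""
    scored = [(log_likelihood(model, x), name) for name, model in dictionary.items()]
    scored.sort(reverse=True)
    return [name for _, name in scored[:top_k]]


# Hypothetical training data: three concepts, three toy feature vectors each.
training = {
    "sky": [[0.10, 0.20, 0.90], [0.20, 0.30, 0.80], [0.15, 0.25, 0.95]],
    "grass": [[0.20, 0.80, 0.10], [0.30, 0.90, 0.20], [0.25, 0.85, 0.15]],
    "sunset": [[0.90, 0.40, 0.10], [0.80, 0.50, 0.20], [0.95, 0.45, 0.15]],
}

# The "dictionary" maps each concept to its trained statistical model.
dictionary = {name: fit_diag_gaussian(vecs) for name, vecs in training.items()}

# Annotate an unseen image with its two most likely concepts.
ranked = rank_concepts(dictionary, [0.12, 0.22, 0.90], top_k=2)
print(ranked)
```

Because each concept is scored independently of the others, a new concept could be added in this sketch by fitting one more model, without retraining the existing ones.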
Citation:
Jia Li and James Z. Wang, "Automatic Linguistic Indexing of Pictures
by a Statistical Modeling Approach," IEEE Transactions on Pattern
Analysis and Machine Intelligence, vol. 25, no. 9, pp. 1075-1088,
2003.
Copyright 2003 IEEE. Personal use of this material is permitted.
However, permission to reprint/republish this
material for advertising or promotional purposes or for creating new
collective works for resale or redistribution to servers or lists, or
to reuse any copyrighted component of this work in other works, must
be obtained from the IEEE.
Last Modified: November 12, 2003