CS 545I - Advanced Image Databases, W 95/96
Friday, 19 Jan 1996. Second lecture CS545I
1 Part 1 Oscar Firschein, Visiting Scholar
2 The approach to indexing often depends on the nature of the image
database:
General still image and video
Specialized, e.g., faces, parts catalog, maintenance manual
3 - Ideally, we would like to be able to automatically index general
images on the basis of meaning or content
If we could automatically obtain a formal description of an image, we could
extract index terms from the description.
4 - Unfortunately, automatic content analysis of imagery is extremely
difficult
IMAGE COMPLEXITY: A picture is difficult to describe automatically because
objects may be complex and occluded by other objects.
SHADOWS, REFLECTIONS, TEXTURES: These can confuse automated systems.
KNOWLEDGE: Knowledge about the subject being pictured is often required to
interpret an image.
ILL-POSED PROBLEM: The problem of deducing 3-D objects from a single 2-D
image is ill-posed (there are many solutions possible for a given image).
PARALLEL VS. SERIAL PROCESSING: A person normally sees an image "all at
once."
If the person is restricted to viewing the image through a small window that
is moved around over the image, the person's ability to understand the image
is reduced significantly because the adjacent context is very important in
interpreting an image.
Unfortunately, the computer interprets an image in this limited way.
The image is stored as an array of numbers, each representing the grey level
or color at a picture location. This array is explored, often by window
operations, and the results of these operations must be pieced together to
make some sense out of the array.
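As a minimal sketch of what a "window operation" on such an array looks like, the following Python fragment slides a small window over a toy grey-level image and records the mean brightness at each position. The image values, the window size, and the name window_means are illustrative assumptions, not part of the lecture.

```python
# A tiny grey-level image stored as a 2-D array (list of rows).
# The values are made up for illustration.
image = [
    [10, 10, 10, 200],
    [10, 10, 10, 200],
    [10, 10, 10, 200],
]


def window_means(img, size=2):
    """Slide a size x size window over img; return the mean grey level at each position."""
    rows, cols = len(img), len(img[0])
    out = []
    for r in range(rows - size + 1):
        row_out = []
        for c in range(cols - size + 1):
            total = sum(img[r + dr][c + dc]
                        for dr in range(size) for dc in range(size))
            row_out.append(total / (size * size))
        out.append(row_out)
    return out


# Each entry only "sees" a 2 x 2 neighbourhood; the program must still
# piece these local results together to say anything about the whole image.
means = window_means(image)
```

Each output value summarizes only a small neighbourhood, which is the limitation described above: the bright column on the right shows up as a jump in the local means, but nothing in the window results says what object it belongs to.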
5 - What are the basic automated approaches used to make sense of an
unknown image?
Delineate meaningful regions: The segmentation of an image is the division
of the image into fragments, or segments, each of which is homogeneous in
some sense.
An attempt is made to merge and join homogeneous regions to delineate objects.
Find edges of objects: An edge in an image is an image contour across
which the brightness of the image changes abruptly. It can indicate a change
in object surface or a depth discontinuity. The trick is to aggregate pieces
of edge so as to construct meaningful objects.
Analysis of shading: Determine 3-D shape from shading and texture
Combinations of the above
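The first two approaches above can be sketched in a few lines of Python. This is only an illustration under simple assumptions: "homogeneous" is taken to mean that neighbouring grey levels differ by at most a tolerance, and an "edge" is taken to be a large brightness jump between horizontal neighbours. The function names, the tolerance, and the threshold are hypothetical choices.

```python
from collections import deque

# Toy grey-level image: a dark region on the left, a bright one on the right.
image = [
    [10, 12, 11, 90],
    [11, 10, 12, 92],
    [10, 11, 91, 90],
]


def grow_regions(img, tol=5):
    """Label 4-connected regions whose neighbouring pixels differ by at most tol."""
    rows, cols = len(img), len(img[0])
    labels = [[None] * cols for _ in range(rows)]
    next_label = 0
    for r0 in range(rows):
        for c0 in range(cols):
            if labels[r0][c0] is not None:
                continue
            # Start a new region and grow it breadth-first.
            labels[r0][c0] = next_label
            queue = deque([(r0, c0)])
            while queue:
                r, c = queue.popleft()
                for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
                    if (0 <= nr < rows and 0 <= nc < cols
                            and labels[nr][nc] is None
                            and abs(img[nr][nc] - img[r][c]) <= tol):
                        labels[nr][nc] = next_label
                        queue.append((nr, nc))
            next_label += 1
    return labels


def horizontal_edges(img, threshold=40):
    """Mark where brightness changes abruptly between horizontal neighbours."""
    return [[abs(row[c + 1] - row[c]) > threshold for c in range(len(row) - 1)]
            for row in img]


labels = grow_regions(image)
edges = horizontal_edges(image)
```

On this toy image the region grower finds two segments (label 0 for the dark pixels, label 1 for the bright ones), and the edge map marks the boundary between them. Real images rarely separate this cleanly, which is why aggregating regions and edge fragments into meaningful objects is the hard part.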
6 Recognition of objects
Once objects have been delineated in the image, there is still the problem of
identifying the objects. Object recognition is a difficult problem requiring
that the delineated objects be matched against a database of object models.
This is practical when the number of possible objects is small, but is less
effective as the number of possible objects becomes large.
The bottom line is that the current state of the art in automatic description
of arbitrary images is very limited.
7 - Current automatic indexing approaches for general images usually
depend on gross measures of the image: color, texture, and simple shape.
The next several lectures will go into the details of such indexing.
We will see that clever system design, particularly in the user interface,
can overcome many of the limitations of these "non-semantic" approaches.
8 - Dealing with specialized image collections
For a specialized collection of images, if the characteristics of the images
are known, then automatic indexing may be possible.
For a database of faces, we can obtain a description of the typical face,
and can describe each image by its differences from the typical object. These
differences then form the descriptor vector for the image. (See the Pentland
reference "Photobook..", p. 6, "Appearance Photobook", for a discussion of
eigenimage representations.)
In retrieval of intelligence imagery, a stored 3-D model, a "site model"
can be used as an indexing mechanism, and the stored images can be registered
to buildings at the site. Then one can retrieve images that show the roof of a
specific building on the site.
For a parts catalog, one may be able to describe the shape of parts in a
canonical way that can be used as an index.
For a database of drawings that are labeled with parts annotations, one
may be able to automatically pick off part names or i.d. labels for use in an
index.
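The face-database idea above (describe each image by its differences from the typical object) can be illustrated with a deliberately simplified Python sketch. It assumes each face is a flat list of grey levels of equal length and uses the plain mean image as the "typical face". It does not compute eigenimages; it only shows the difference vector that such a representation starts from. All names and pixel values are hypothetical.

```python
def mean_image(images):
    """Pixel-wise mean of equally sized images: a stand-in for the 'typical face'."""
    n = len(images)
    return [sum(img[i] for img in images) / n for i in range(len(images[0]))]


def difference_descriptor(img, typical):
    """Describe an image by its pixel-wise differences from the typical image."""
    return [p - t for p, t in zip(img, typical)]


# Three made-up 4-pixel "faces".
faces = [
    [100, 120, 80, 60],
    [110, 130, 70, 50],
    [90, 110, 90, 70],
]

typical = mean_image(faces)
descriptor = difference_descriptor(faces[1], typical)
```

Here the typical face is [100.0, 120.0, 80.0, 60.0] and the second face is described by [10.0, 10.0, -10.0, -10.0]. An eigenimage approach would go one step further and express such difference vectors with a small number of coefficients.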
9 - In summary, general image databases are indexed using color, texture,
and simple shape. Specialized image databases use indexing techniques
designed to take advantage of the known characteristics of the
collection.
10 Part 2, Dragutin Petkovic, Manager, Advanced Algorithms,
Architectures, and Applications
11 BASIC ISSUES IN CONTENT BASED RETRIEVAL
Image representation
Matching of image descriptors with query descriptors
Integration with traditional search and DB (SQL, text)
User Interface
Performance measures (retrieval accuracy, storage, speed)
Network and WWW and other systems issues
Data capture, annotation and indexing
Applications
Extends to video
12 Image representation
Non-image data (places, prices, author etc.) - mostly keyed in
Image descriptors: easy to compute, applicable to a variety of
planned and unplanned queries
Image descriptors: color, texture, shape, layout, relationships
Image descriptors can refer to the whole image or to image objects
Image objects can be obtained manually (outlining), automatically
(segmentation, object reco) or semiautomatically
Offer SEMANTIC compression (Pentland)
No need to solve full object reco problem (although desirable)
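As one example of a descriptor that is easy to compute and applies to many queries, here is a minimal Python sketch of a coarse color histogram for a whole image. It assumes pixels are (r, g, b) tuples with 8-bit channels. The bin count and the function name are illustrative assumptions, not a description of any particular system.

```python
def color_histogram(pixels, bins_per_channel=2):
    """Normalized coarse color histogram of (r, g, b) pixels with 8-bit channels."""
    hist = [0.0] * (bins_per_channel ** 3)
    for r, g, b in pixels:
        # Map each 0..255 channel value to a bin index 0..bins_per_channel-1.
        rb = r * bins_per_channel // 256
        gb = g * bins_per_channel // 256
        bb = b * bins_per_channel // 256
        hist[(rb * bins_per_channel + gb) * bins_per_channel + bb] += 1
    total = len(pixels)
    return [count / total for count in hist]


# Four made-up pixels: two reddish, one blue, one nearly black.
pixels = [(255, 0, 0), (250, 10, 5), (0, 0, 255), (10, 20, 30)]
descriptor = color_histogram(pixels)
```

The resulting 8-element vector says that half of the pixels fall in the "red" bin and a quarter each in the "dark" and "blue" bins. The same function could be applied to the pixels of a single outlined object instead of the whole image.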
13 Matching
Match one query with a very large number of samples in the DB
(unlike in CV: one image with 10s of models). Retrieve samples,
not a "statement" like good/bad.
Needs to be fast, indexable, and to correspond to human perception
and expectations
Examples: normalized quadratic functions, nearest neighbors,
neural networks.
Issues of color spaces for matching vs. RGB
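A hedged sketch of two of the matching examples named above: a quadratic-form distance d(q, x) = (q - x)^T A (q - x) between descriptor vectors, used to rank database samples by nearest neighbour. The matrix A, which here encodes an assumed similarity between histogram bins, the three-bin descriptors, and the sample names are all invented for illustration.

```python
def quadratic_distance(q, x, A):
    """Quadratic-form distance (q - x)^T A (q - x) between two descriptor vectors."""
    d = [qi - xi for qi, xi in zip(q, x)]
    n = len(d)
    return sum(d[i] * A[i][j] * d[j] for i in range(n) for j in range(n))


def nearest_neighbors(query, database, A, k=2):
    """Return the names of the k database samples closest to the query."""
    ranked = sorted(database.items(),
                    key=lambda item: quadratic_distance(query, item[1], A))
    return [name for name, _ in ranked[:k]]


# Assumed similarity matrix: bins 0 and 1 are treated as perceptually close.
A = [
    [1.0, 0.5, 0.0],
    [0.5, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]

# Hypothetical 3-bin color descriptors.
database = {
    "sunset": [0.75, 0.25, 0.0],
    "forest": [0.0, 0.25, 0.75],
    "beach": [0.5, 0.5, 0.0],
}

query = [1.0, 0.0, 0.0]
top = nearest_neighbors(query, database, A)
```

With these numbers the distances are 0.0625 for "sunset", 0.25 for "beach", and 1.375 for "forest", so the query returns the first two in that order. Note that the result is a ranked list of samples rather than a yes/no decision, and that this linear scan is not indexable as written.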
14 Integration with traditional DB and search methods
Non-image descriptors (keywords, free text, numerics) searched
by text search, SQL. Note the difference between keyword searches,
and SQL on text/numeric data
SQL produces a SET, content retrieval RANKS the set (does not
make a "cut")
Systems aspects: How to integrate with DB in an extensible way?
Note IBM's DB2 Extenders and Illustra's DataBlades.
How to merge ranked lists?
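The difference between a SET and a RANKING, and one simple possibility for merging ranked lists, can be sketched with Python's built-in sqlite3 module. The table layout, the single "redness" descriptor, and the weighted-rank merge are assumptions made for illustration. They are not how DB2 Extenders or Illustra work, and the merge rule is only one of many possible answers to the question above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE images (id INTEGER PRIMARY KEY, author TEXT, redness REAL)")
conn.executemany(
    "INSERT INTO images VALUES (?, ?, ?)",
    [(1, "smith", 0.9), (2, "jones", 0.8), (3, "smith", 0.2), (4, "smith", 0.6)],
)

# Step 1: SQL returns the SET of rows that satisfy the predicate.
candidates = conn.execute(
    "SELECT id, redness FROM images WHERE author = ?", ("smith",)
).fetchall()

# Step 2: content retrieval RANKS that set by closeness to the query value;
# every candidate stays in the list, nothing is cut.
query_redness = 1.0
ranked = sorted(candidates, key=lambda row: abs(row[1] - query_redness))
ranked_ids = [row[0] for row in ranked]


def merge_ranked(lists, weights):
    """Merge rankings of the same items by weighted rank position (lower is better)."""
    scores = {}
    for ranking, weight in zip(lists, weights):
        for position, item in enumerate(ranking):
            scores[item] = scores.get(item, 0.0) + weight * position
    # Ties are broken by item id so the result is deterministic.
    return sorted(scores, key=lambda item: (scores[item], item))


# A second, made-up ranking of the same images (say, by texture).
texture_ranked_ids = [4, 3, 1]
merged = merge_ranked([ranked_ids, texture_ranked_ids], [0.5, 0.5])
```

The SQL step narrows four images to the three by "smith", the content step orders them as 1, 4, 3, and the equal-weight merge with the texture ranking gives 4, 1, 3. Changing the weights changes the merged order, which is part of why combined ranks are hard to explain to users.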
15 User Interface
Needs to combine browse, search, navigation and relevance feedback
Needs to address wide variety of users, mostly non-technical
"Visual" users search differently than others
Constant interplay between narrowing and broadening the search
Fast
User should not get lost; system should be intuitive with not too many
controls
Hard to communicate a variety of ranked results, especially if
ranks are combined
WWW issues
16 Performance Measures
Retrieval accuracy: normalized recall and precision (Salton)
Speed (browser, network, indexing, storage, fast BLOB support)
Cost: storage, data capture and indexing costs
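For the retrieval accuracy measures, a small Python sketch follows. Precision and recall are computed on a cut-off result list. Normalized recall is computed on a full ranking using the form usually attributed to Salton, R_norm = 1 - (sum of actual ranks of relevant items - sum of ideal ranks) / (n (N - n)), where N is the collection size and n the number of relevant items. The function names and the toy ranking are assumptions, and the normalized-recall function assumes that every relevant item appears in the ranking and that 0 < n < N.

```python
def precision_recall(retrieved, relevant):
    """Precision and recall of a retrieved list against the set of relevant items."""
    hits = len(set(retrieved) & set(relevant))
    return hits / len(retrieved), hits / len(relevant)


def normalized_recall(ranking, relevant):
    """Normalized recall of a full ranking: 1.0 is ideal, 0.0 is the worst case."""
    N, n = len(ranking), len(relevant)
    # Sum of the 1-based ranks at which the relevant items actually appear.
    actual = sum(rank for rank, item in enumerate(ranking, start=1)
                 if item in relevant)
    # Sum of ranks if all relevant items were at the top: 1 + 2 + ... + n.
    ideal = sum(range(1, n + 1))
    return 1 - (actual - ideal) / (n * (N - n))


# Toy collection of five images, two of which are relevant to the query.
ranking = ["a", "b", "c", "d", "e"]
relevant = {"a", "c"}

p, r = precision_recall(ranking[:2], relevant)
r_norm = normalized_recall(ranking, relevant)
```

Cutting the list after two results gives precision 0.5 and recall 0.5. The normalized recall of the full ranking is 5/6, reflecting that one relevant image is ranked first and the other third instead of second.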
17 Network and WWW issues
Networks are slow, and we need fast browse for image data
The slower the browse, the more important content based retrieval
Interactivity on WWW
Needs to stage the search (do the fastest one first)
Access control
18 Data Capture and annotation/indexing
Often overlooked, but key to success. Very expensive to digitize
and have people key in the data for 10000000s of images
Data capture: digitization, color accuracy, sizing, cropping etc.
Data annotation/indexing: entering the keywords and data;
obtaining related data from other sources (other DB, cameras etc.);
preprocessing to extract content descriptors, outlining for
objects
Issues of people's inconsistency, costs of training
Tightly controlled processes for data input
Meta data (indices) might become more valuable than BLOBS
19 Extensions to Video
Video also needs to be searched for by keywords and data
Most of the image database issues apply here, with important
additions:
Browsing video is even more time consuming due to its size
Break video into scenes, represent each scene with data and the
keyframe. Or create salient stills to represent the video
Use image content descriptors on the keyframe
Video-specific content descriptors like motion (camera, object) etc.
Systems issues even more complex: size of data, QOS etc.
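The step of breaking video into scenes and picking a keyframe per scene can be sketched as follows. This Python fragment assumes each frame is a flat list of 8-bit grey levels, declares a scene change when the histograms of consecutive frames differ by more than a threshold, and takes the middle frame of each scene as its keyframe. The histogram size, the threshold, and the keyframe rule are illustrative assumptions only.

```python
def histogram(frame, bins=4):
    """Normalized grey-level histogram of a frame given as a flat list of 8-bit values."""
    hist = [0.0] * bins
    for value in frame:
        hist[value * bins // 256] += 1
    return [h / len(frame) for h in hist]


def split_into_scenes(frames, threshold=0.5):
    """Group frame indices into scenes; a large histogram change starts a new scene."""
    scenes = [[0]]
    prev = histogram(frames[0])
    for i in range(1, len(frames)):
        cur = histogram(frames[i])
        # L1 distance between consecutive histograms.
        if sum(abs(a - b) for a, b in zip(prev, cur)) > threshold:
            scenes.append([])
        scenes[-1].append(i)
        prev = cur
    return scenes


# Four made-up 4-pixel frames: two dark ones followed by two bright ones.
frames = [
    [10, 20, 30, 40],
    [12, 22, 28, 41],
    [200, 210, 220, 230],
    [205, 215, 225, 235],
]

scenes = split_into_scenes(frames)
# Represent each scene by its middle frame (an assumed keyframe rule).
keyframes = [scene[len(scene) // 2] for scene in scenes]
```

The toy sequence splits into two scenes, frames 0-1 and frames 2-3, with frames 1 and 3 as keyframes. The image content descriptors discussed earlier could then be computed on those keyframes only.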