CS 545I - Advanced Image and Video Databases, Winter 99
Friday, 8 Jan 1999. First lecture, CS 545I
Part 1 - Introduction
Oscar Firschein, Stanford
1 - Why study image and video databases?
Image and video databases are becoming an increasingly important type of
database as sources of images multiply, methods of storage improve, and the
Net provides the means to distribute them. However, both still images and
video sequences have important characteristics that set them apart from
conventional data. The database designer must know and understand these
characteristics.
2 - Goal of seminar
To gain an appreciation of the special problems of image and video retrieval,
to learn about some current systems, to learn about the indexing and database
organization techniques, and to get hands-on experience with one of the
systems.
3 - Some sources of images
Medical imagery (pathology slides, X-ray, NMR, ultrasound, etc.)
News and entertainment videos
Educational videos
Art and photo collections
Consumer and engineering catalogues
Scientific images (astronomy, earth resources, etc.)
Images collected by intelligence agencies (often satellite images);
video sequences taken by unmanned vehicles
Home photos/videos
4 - Types of user request
There is a remarkable variety of user requests. Users may want combinations
of the following requests:
SIMILARITY: Find an image that looks like this image (or parts of it
look like part of this image)
OBJECT: Find an image that contains a cat
CONDITION/SITUATION: Find photos of water pollution
SPECIFIC PERSON: Find a video frame of Clinton talking to Rabin
OBJECT RELATIONSHIP: Find an image that contains a cat near a dog
MOOD: Find a sad/happy/... picture
VIEW ANGLE: Find a picture of a crowd taken from an airplane
TIME OF DAY/SEASON: Find a picture of Yosemite taken at day/night/sunset/winter
COLOR: Find a picture with a red apple
TEXTURE: Find a picture with a brick texture
SHAPE: Find a picture with a circular object
GEOGRAPHIC: Find an aerial image of the port of San Francisco
5 - Approach to image database design
One could treat an image database as if it were a document database.
Create a database in which each image is manually tagged with one or more
index terms.
Queries are combinations of index terms.
The user reviews the results and modifies the query.
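
The document-database approach described above can be sketched in a few lines. This is only an illustrative sketch, not a system from the lecture: the catalogue, the image ids, and the function names build_index and query_all are all hypothetical, and a query is assumed to be a plain AND of index terms.

```python
from collections import defaultdict


def build_index(tagged_images):
    """Map each index term to the set of image ids manually tagged with it."""
    index = defaultdict(set)
    for image_id, terms in tagged_images.items():
        for term in terms:
            index[term.lower()].add(image_id)
    return index


def query_all(index, terms):
    """Return the ids of images tagged with every term (boolean AND)."""
    sets = [index.get(term.lower(), set()) for term in terms]
    if not sets:
        return set()
    return set.intersection(*sets)


# Hypothetical hand-tagged catalogue.
catalogue = {
    "img001": ["cat", "dog", "garden"],
    "img002": ["cat", "sofa"],
    "img003": ["dog", "beach", "sunset"],
}

index = build_index(catalogue)
print(sorted(query_all(index, ["cat", "dog"])))  # ['img001']
print(sorted(query_all(index, ["cat"])))         # ['img001', 'img002']
```

The second query shows the review-and-modify loop: dropping a term broadens the result set, adding one narrows it. Everything depends on the terms a human tagger happened to choose.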
HOWEVER
This approach does not take advantage of the special aspects of an image:
People can review candidate retrieved images very quickly, and can readily
indicate whether the search is converging
An image has many "meanings" depending on the interest or
"point of view" of the viewer
6 - An ideal image database system
- allows user to review image "thumbnails"
- presents images in order of "closeness" to the query
- uses a variety of description approaches, many of them automated
- allows retrieval through conventional (relational, etc.) databases
- allows search refinement
- allows search refinement
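
Two of the items above, retrieval through a conventional database and presentation in order of "closeness", can be combined as follows. This is a minimal sketch under stated assumptions: each record is assumed to carry a precomputed feature vector, Euclidean distance stands in for whatever closeness measure a real system uses, and the record fields and the name rank_by_closeness are hypothetical.

```python
import math


def rank_by_closeness(query_vec, records, source=None):
    """Filter on a conventional attribute, then sort by feature distance."""
    candidates = [r for r in records if source is None or r["source"] == source]
    return sorted(candidates, key=lambda r: math.dist(query_vec, r["features"]))


# Hypothetical records: an ordinary attribute plus a 3-element feature vector.
records = [
    {"id": "a", "source": "medical", "features": (0.1, 0.1, 0.8)},
    {"id": "b", "source": "news", "features": (0.2, 0.7, 0.1)},
    {"id": "c", "source": "news", "features": (0.8, 0.1, 0.1)},
    {"id": "d", "source": "news", "features": (0.5, 0.4, 0.1)},
]

query = (0.85, 0.1, 0.05)
ranked = [r["id"] for r in rank_by_closeness(query, records, source="news")]
print(ranked)  # ['c', 'd', 'b']
```

The ranked ids are what a thumbnail browser would display first; search refinement amounts to changing the query vector or the attribute filter and ranking again.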
7 - Capabilities needed
- Similarity metric must match the human idea of similarity of images
- Search must be efficient enough to be interactive
- User must be able to specify needs without becoming an image or DB
expert -- a good approach is for user to specify by providing examples
close to what is desired
- Image descriptor-finding must be automated
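
As an illustration of an automated descriptor with a query-by-example comparison, the sketch below uses a coarse color histogram and histogram intersection. This is one common choice, not necessarily the one used by the systems covered in the seminar. Images are assumed to be non-empty lists of 8-bit (r, g, b) tuples, and the pixel data and function names are hypothetical.

```python
def color_histogram(pixels, bins_per_channel=2):
    """Normalized histogram over coarse RGB bins (assumes a non-empty image)."""
    step = 256 // bins_per_channel
    hist = [0.0] * (bins_per_channel ** 3)
    for r, g, b in pixels:
        # Clamp so that 255 falls in the last bin when 256 is not divisible.
        ri = min(r // step, bins_per_channel - 1)
        gi = min(g // step, bins_per_channel - 1)
        bi = min(b // step, bins_per_channel - 1)
        hist[(ri * bins_per_channel + gi) * bins_per_channel + bi] += 1.0
    total = len(pixels)
    return [h / total for h in hist]


def histogram_intersection(h1, h2):
    """Similarity in [0, 1]; 1 means the normalized histograms are identical."""
    return sum(min(a, b) for a, b in zip(h1, h2))


# Hypothetical four-pixel "images".
apple = [(200, 10, 10)] * 3 + [(250, 250, 250)]
tomato = [(220, 30, 20)] * 2 + [(240, 240, 240)] * 2
sky = [(20, 40, 220)] * 4

example = color_histogram(apple)
print(histogram_intersection(example, color_histogram(tomato)))  # 0.75
print(histogram_intersection(example, color_histogram(sky)))     # 0.0
```

The user supplies only the example image and needs no image-processing expertise. Whether such a score matches the human idea of similarity is a separate question: two images with the same colors can show entirely different things.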
8 - Problems
How to normalize an image: scale, orientation
Capturing aspects of content by using invariants or discriminants
Can these invariants capture semantic information?
Efficiency of invariants
Can title or caption of image (or audio portion of video) aid in finding
invariants?
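
The normalization and invariant questions above can be made concrete with a small sketch. It assumes a shape is given as a list of (x, y) outline points that are not all identical. The shape is normalized by moving its centroid to the origin and scaling to unit RMS radius, and the sorted point-to-centroid distances then serve as a descriptor that is unchanged by translation, scaling, and rotation. The name radial_signature and the sample shapes are hypothetical and chosen only for illustration.

```python
import math


def normalize_shape(points):
    """Move the centroid to the origin and scale to unit RMS radius."""
    n = len(points)
    cx = sum(x for x, _ in points) / n
    cy = sum(y for _, y in points) / n
    centered = [(x - cx, y - cy) for x, y in points]
    rms = math.sqrt(sum(x * x + y * y for x, y in centered) / n)
    return [(x / rms, y / rms) for x, y in centered]


def radial_signature(points):
    """Sorted centroid distances of the normalized shape."""
    return sorted(math.hypot(x, y) for x, y in normalize_shape(points))


def same_signature(a, b, tol=1e-9):
    """Compare two signatures element by element within a tolerance."""
    return len(a) == len(b) and all(
        math.isclose(p, q, abs_tol=tol) for p, q in zip(a, b)
    )


shape = [(0, 0), (4, 0), (4, 1), (0, 3)]
# The same shape rotated by 90 degrees, scaled by 3, and shifted by (10, -5).
moved = [(-3 * y + 10, 3 * x - 5) for x, y in shape]
square = [(0, 0), (1, 0), (1, 1), (0, 1)]

print(same_signature(radial_signature(shape), radial_signature(moved)))
print(same_signature(radial_signature(shape), radial_signature(square)))
```

A descriptor like this captures geometry only. It says nothing about what the shape depicts, which is exactly the open question of whether invariants can carry semantic information.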
Part 2 - Human Perception
Dr. Martin A. Fischler, AI Center, SRI International
A person can look at an image and instantly "understand" it: identify
the objects in the image, their relationships, and the story the image
is telling.
This part of the lecture will describe the physics and biology of
vision and will then deal with the psychology of vision. By examining biological
vision, we can gain some insight into the difficulties of machine vision.