Gates Building, Room 100, 3:15-4:30
28 Feb, 1997

Database Indexing Techniques

Prof. Gio Wiederhold

Seminar abstract

Objective: Image represented by fewer bits

Compression methods


Effect on TV images:

reduction to 5%, imperceptible quality difference


Future Directions

Indexing Outline

1. Create sets of symbolic descriptors for images

2. Collect descriptors for easy processing

3. To retrieve, create descriptor set for query

4. Match descriptors, typically multiple (partial match)

5. Provide feedback to customer

6. Improve the query if needed, return to 3.

Index features

Index Creation Alternatives

Features for describing images

Segmenting prior to Indexing

Features of segments

coherence permits finer descriptions

each feature creates one (or more) indexes

Storing Segments

(out of context)
(but only a few should be presented anyhow)

Index construction


Unique index has one distinct key value per image (n entries)

Most image indexes are non-unique (c distinct values <<n )

1.distinct from the data (even remote) (limits compression)

2. adjoining the data

3. intermingled (rapid data access)

Attributes for images/segments

Index usage

seems easy, but values are very correlated, so that

either s << n (too many results) or s > n (none).

Index selection

Which indexes to use for

1. match selections reliable, external

take intersections (inputs can be large)

(poorly supported by relational, business-oriented systems)

2. ranking fuzzy, internal

show at least thumbnails, user cannot judge from index values

3. not at all overly restrictive

Combine indexes into a hierarchy? (K,D trees)

say, shape before color

situation, customer dependent

Multiple hierarchies ?

Cost of dealing with indexes

x = no of levels; y = fanout = blocksize/entries (compress!)

n = 106, B= 1000, E=10, y = 100 ð 3 ! fast

Fuzzy queries

nearest neighbor semantically

nearness is a semantic concept,

depends on customer, application

Example `near SFO'


indexing method effectiveness depends on