CS 545I - Advanced Image Databases, W 95/96

Friday, 19 Jan 1996. Second lecture CS545I

1 Part 1 Oscar Firschein, Visiting Scholar

2 The approach to indexing often depends on the nature of the image database:

  • General still image and video
  • Specialized, e.g., faces, parts catalog, maintenance manual

    3 - Ideally, we would like to be able to automatically index general images on the basis of meaning or content

    If we could automatically obtain a formal description of an image, we could extract index terms from the description.

    4 - Unfortunately, automatic content analysis of imagery is extremely difficult

  • IMAGE COMPLEXITY: A picture is difficult to describe automatically because objects may be complex and occluded by other objects.
  • SHADOWS, REFLECTIONS, TEXTURES: These can confuse automated systems.
  • KNOWLEDGE: Knowledge about the subject being pictured is often required to interpret an image.
  • ILL-POSED PROBLEM: The problem of deducing 3-D objects from a single 2-D image is ill-posed (there are many solutions possible for a given image).
  • PARALLEL VS. SERIAL PROCESSING: A person normally sees an image "all at once." If the person is restricted to viewing the image through a small window that is moved around over the image, the person's ability to understand the image is reduced significantly because the adjacent context is very important in interpreting an image.
    Unfortunately, the computer interprets an image in this limited way. The image is stored as an array of numbers, each representing the grey level or color at a picture location. This array is explored, often by window operations, and the results of these operations must be pieced together to make sense of the array.
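
    As a rough illustration of the window operations described above, the following minimal sketch slides a small window over a grey-level array and records the mean grey level at each position. The function name window_means and the tiny test image are assumptions made for this example, not part of the lecture.

```python
def window_means(image, size):
    """Slide a size x size window over a 2-D grey-level array and
    return the mean grey level seen at each window position."""
    rows, cols = len(image), len(image[0])
    means = []
    for r in range(rows - size + 1):
        row_out = []
        for c in range(cols - size + 1):
            # The computer only "sees" the pixels inside this window.
            total = sum(image[r + dr][c + dc]
                        for dr in range(size) for dc in range(size))
            row_out.append(total / (size * size))
        means.append(row_out)
    return means


# Hypothetical 3 x 4 image: dark on the left, bright on the right.
image = [
    [0, 0, 9, 9],
    [0, 0, 9, 9],
    [0, 0, 9, 9],
]
local = window_means(image, 2)
print(local)
```

    Each output value summarizes only a 2 x 2 neighbourhood. Recognizing that the image contains one dark and one bright region requires piecing these local results together, which is the difficulty the slide points out.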

    5 - What are the basic automated approaches used to make sense of an unknown image?

  • Delineate meaningful regions: The segmentation of an image is the division of the image into fragments, or segments, each of which is homogeneous in some sense (for example, brightness, color, or texture). An attempt is made to merge and join homogeneous regions to delineate objects.
  • Find edges of objects: An edge in an image is an image contour across which the brightness of the image changes abruptly. It can indicate a change in object surface or a depth discontinuity. The trick is to aggregate pieces of edge so as to construct meaningful objects.
  • Analysis of shading: Determine 3-D shape from shading and texture
  • Combinations of the above
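
    The edge definition above (a contour across which brightness changes abruptly) can be sketched as a threshold on local brightness differences. This is only one simple way to mark edge pixels; the function name edge_map, the forward-difference scheme, and the threshold value are assumptions chosen for illustration.

```python
def edge_map(image, threshold):
    """Mark a pixel as an edge when the brightness change to its right
    and lower neighbours (forward differences) reaches the threshold."""
    rows, cols = len(image), len(image[0])
    edges = [[False] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            dx = image[r][c + 1] - image[r][c] if c + 1 < cols else 0
            dy = image[r + 1][c] - image[r][c] if r + 1 < rows else 0
            edges[r][c] = abs(dx) + abs(dy) >= threshold
    return edges


# Hypothetical image with a vertical boundary between dark and bright.
image = [
    [10, 10, 200, 200],
    [10, 10, 200, 200],
    [10, 10, 200, 200],
]
edges = edge_map(image, threshold=50)
print(edges)
```

    The sketch only finds isolated edge pixels. Aggregating them into the outlines of meaningful objects, which the slide calls "the trick", is the hard part and is not attempted here.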

    6 Recognition of objects

    Once objects have been delineated in the image, there is still the problem of identifying the objects. Object recognition is a difficult problem requiring that the delineated objects be matched against a database of object models. This is practical when the number of possible objects is small, but is less effective as the number of possible objects becomes large. The bottom line is that the current state of the art in automatic description of arbitrary images is very limited.

    7 - Current automatic indexing approaches for general images usually depend on gross measures of the image: color, texture, and simple shape.

  • The next several lectures will go into the details of such indexing.
  • We will see that clever system design, particularly in the user interface, can overcome many of the limitations of these "non-semantic" approaches.
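
    To make "gross measures" concrete, here is a minimal sketch of a color descriptor: a coarse, normalized RGB histogram, compared by histogram intersection. Histogram intersection is a common choice swapped in for illustration, not necessarily the measure used in the systems covered later. The function names, bin count, and pixel values are assumptions.

```python
def color_histogram(pixels, bins_per_channel=2):
    """Quantize (r, g, b) pixels (0-255 per channel) into coarse bins
    and return a histogram normalized to sum to 1."""
    step = 256 // bins_per_channel
    top = bins_per_channel - 1
    hist = [0.0] * (bins_per_channel ** 3)
    for r, g, b in pixels:
        rb, gb, bb = (min(v // step, top) for v in (r, g, b))
        hist[(rb * bins_per_channel + gb) * bins_per_channel + bb] += 1
    return [h / len(pixels) for h in hist]


def histogram_intersection(h1, h2):
    """Similarity in [0, 1]: 1 means identical color distributions."""
    return sum(min(a, b) for a, b in zip(h1, h2))


# Hypothetical images given as flat lists of RGB pixels.
red_image = color_histogram([(250, 10, 10)] * 4)
mostly_red = color_histogram([(240, 20, 20)] * 3 + [(10, 10, 250)])
blue_image = color_histogram([(10, 10, 250)] * 4)

print(histogram_intersection(red_image, mostly_red))
print(histogram_intersection(red_image, blue_image))
```

    Note that the descriptor says nothing about what is pictured, only how much of each color is present, which is why these approaches are called "non-semantic".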

    8 - Dealing with specialized image collections

    For a specialized collection of images, if the characteristics of the images are known, then automatic indexing may be possible.
  • For a database of faces, we can obtain a description of the typical face, and can describe each image by its differences from the typical object. These differences then form the descriptor vector for the image. (See the Pentland reference "Photobook..", p. 6, "Appearance Photobook", for a discussion of eigenimage representations.)
  • In retrieval of intelligence imagery, a stored 3-D model, a "site model" can be used as an indexing mechanism, and the stored images can be registered to buildings at the site. Then one can retrieve images that show the roof of a specific building on the site.
  • For a parts catalog, one may be able to describe the shape of parts in a canonical way that can be used as an index.
  • For a database of drawings that are labeled with parts annotations, one may be able to automatically pick off part names or i.d. labels for use in an index
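
    The face-database idea above (describe each image by its differences from the typical object) can be sketched as follows. This is a simplified stand-in: a full eigenimage representation would additionally project the difference vectors onto a small number of principal components, which this sketch omits. The tiny 4-pixel "faces" and all names are hypothetical.

```python
import math


def mean_image(images):
    """Pixel-wise average of equally sized images: the 'typical' object."""
    n = len(images)
    return [sum(img[i] for img in images) / n for i in range(len(images[0]))]


def descriptor(image, typical):
    """Descriptor vector: difference of the image from the typical object."""
    return [p - t for p, t in zip(image, typical)]


def distance(d1, d2):
    """Euclidean distance between two descriptor vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(d1, d2)))


# Hypothetical database of flattened 4-pixel grey-level "faces".
faces = {
    "face_a": [10, 20, 30, 40],
    "face_b": [12, 22, 28, 38],
    "face_c": [50, 60, 10, 0],
}
typical = mean_image(list(faces.values()))
index = {name: descriptor(img, typical) for name, img in faces.items()}

# Describe the query the same way, then look up the closest descriptor.
query = descriptor([10, 21, 30, 40], typical)
best = min(index, key=lambda name: distance(index[name], query))
print(best)
```

    With raw difference vectors the distances are the same as between the original images. The practical gain comes from the omitted projection step, which shortens the descriptors while keeping most of the variation among faces.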

    9 - In summary, general image databases are indexed using color, texture, and simple shape. Specialized image databases use indexing techniques designed to take advantage of the known characteristics of the collection.

    10 Part 2, Dragutin Petkovic, Manager, Advanced Algorithms, Architectures, and Applications

    11 BASIC ISSUES IN CONTENT BASED RETRIEVAL

  • Image representation
  • Matching of image descriptors with query descriptors
  • Integration with traditional search and DB (SQL, text)
  • User Interface
  • Performance measures (retrieval accuracy, storage, speed)
  • Network and WWW and other systems issues
  • Data capture, annotation and indexing
  • Applications
  • Extends to video

    12 Image representation

  • Non-image data (places, prices, author etc.) - mostly keyed in
  • Image descriptors: easy to compute, applicable to a variety of planned and unplanned queries
  • Image descriptors: color, texture, shape, layout, relationships
  • Image descriptors can refer to the whole image or to image objects
  • Image objects can be obtained manually (outlining), automatically (segmentation, object reco) or semiautomatically
  • Offer SEMANTIC compression (Pentland)
  • No need to solve full object reco problem (although desirable)

    13 Matching

  • Match one query against a very large number of samples in the DB (unlike in CV, where one image is matched against 10s of models). Retrieve samples, not a "statement" such as good/bad.
  • Needs to be fast, indexable, and to correspond to human perception and expectations
  • Examples: normalized quadratic functions, nearest neighbors, neural networks.
  • Issues of color spaces for matching vs. RGB
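
    A minimal sketch of two of the matching examples named above: a quadratic-form distance (x - y)^T A (x - y) used for nearest-neighbor ranking. The matrix A, which encodes how similar the histogram bins are to each other, and the three-bin histograms are assumptions invented for this example. Normalization of the distance is omitted.

```python
def quadratic_distance(x, y, A):
    """Quadratic-form distance (x - y)^T A (x - y)."""
    d = [a - b for a, b in zip(x, y)]
    n = len(d)
    return sum(d[i] * A[i][j] * d[j] for i in range(n) for j in range(n))


def nearest_neighbors(query, database, A, k):
    """Return the names of the k database entries closest to the query."""
    ranked = sorted(database,
                    key=lambda name: quadratic_distance(query, database[name], A))
    return ranked[:k]


# Hypothetical 3-bin color histograms; bins are (red, orange, blue).
# A says red and orange are perceptually close, blue is unrelated.
A = [
    [1.0, 0.8, 0.0],
    [0.8, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]
database = {
    "sunset": [0.0, 1.0, 0.0],  # all orange
    "ocean": [0.0, 0.0, 1.0],   # all blue
}
query = [1.0, 0.0, 0.0]         # all red

print(nearest_neighbors(query, database, A, k=1))
```

    With an identity matrix both images would be equally far from the all-red query. The cross terms in A pull the orange image closer, which is one way a distance can be made to correspond better to human perception.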

    14 Integration with traditional DB and search methods

  • Non-image descriptors (keywords, free text, numerics) searched by text search, SQL. Note the difference between keyword searches, and SQL on text/numeric data
  • SQL produces a SET, content retrieval RANKS the set (does not make a "cut")
  • Systems aspects: How to integrate with DB in an extensible way? Note IBM's DB2 Extenders and Illustra's blades.
  • How to merge ranked lists?
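
    The contrast between an SQL set and a content ranking, and one possible answer to the merging question, can be sketched with Python's built-in sqlite3 module. The table layout, the scalar "redness" and "roughness" descriptors, and the weighted-sum merge are all assumptions made for illustration. A weighted sum is only one of several ways to combine ranked lists.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE images (id INTEGER PRIMARY KEY, author TEXT, "
             "redness REAL, roughness REAL)")
conn.executemany("INSERT INTO images VALUES (?, ?, ?, ?)", [
    (1, "adams", 0.9, 0.1),
    (2, "adams", 0.2, 0.8),
    (3, "baker", 0.95, 0.9),
    (4, "adams", 0.6, 0.7),
])

# SQL produces a SET: each row either satisfies the predicate or does not.
rows = conn.execute(
    "SELECT id, redness, roughness FROM images WHERE author = ?",
    ("adams",)).fetchall()


def rank_by(rows, column, target):
    """Content retrieval RANKS the set by distance to the query value."""
    return [row[0] for row in sorted(rows, key=lambda r: abs(r[column] - target))]


def merge_ranked(rows, targets, weights):
    """Merge per-descriptor rankings with a weighted sum of distances."""
    def score(row):
        return sum(w * abs(row[i + 1] - t)
                   for i, (t, w) in enumerate(zip(targets, weights)))
    return [row[0] for row in sorted(rows, key=score)]


color_rank = rank_by(rows, 1, target=1.0)      # want very red
texture_rank = rank_by(rows, 2, target=1.0)    # want very rough
merged = merge_ranked(rows, targets=(1.0, 1.0), weights=(0.4, 0.6))
print(color_rank, texture_rank, merged)
```

    The two single-descriptor rankings disagree, and the merged order matches neither, which hints at why combined ranks are hard to explain to the user.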

    15 User Interface

  • Needs to combine browse, search, navigation and relevance feedback
  • Needs to address wide variety of users, mostly non-technical
  • "Visual" users search differently than others
  • Constant interplay between narrowing and broadening the search
  • Fast
  • User should not get lost; system should be intuitive with not too many controls
  • Hard to communicate a variety of ranked results, especially if ranks are combined
  • WWW issues

    16 Performance Measures

  • Retrieval accuracy: normalized recall and precision (Salton)
  • Speed (browser, network, indexing, storage, fast BLOB support)
  • Cost: storage, data capture and indexing costs
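
    A minimal sketch of the retrieval accuracy measures: precision and recall at a cutoff, and normalized recall in the form commonly attributed to Salton, R_norm = 1 - (sum of actual ranks of relevant items - sum of ideal ranks) / (n (N - n)), where N is the collection size and n the number of relevant items. The sketch assumes every relevant item appears in the ranking and that 0 < n < N. Function names and data are hypothetical.

```python
def precision_recall(ranked, relevant, cutoff):
    """Precision and recall over the top `cutoff` retrieved items."""
    retrieved = ranked[:cutoff]
    hits = sum(1 for item in retrieved if item in relevant)
    return hits / len(retrieved), hits / len(relevant)


def normalized_recall(ranked, relevant):
    """1.0 when all relevant items are ranked first, 0.0 when ranked last."""
    total, n = len(ranked), len(relevant)
    actual = sum(rank for rank, item in enumerate(ranked, start=1)
                 if item in relevant)
    ideal = sum(range(1, n + 1))
    return 1.0 - (actual - ideal) / (n * (total - n))


ranked = ["a", "b", "c", "d", "e", "f"]   # system output, best first
relevant = {"a", "c", "f"}                # ground truth for the query

precision, recall = precision_recall(ranked, relevant, cutoff=3)
r_norm = normalized_recall(ranked, relevant)
print(precision, recall, r_norm)
```

    Unlike precision and recall, normalized recall does not depend on a cutoff, which suits systems that rank the whole set rather than making a "cut".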

    17 Network and WWW issues

  • Networks are slow, and we need fast browse for image data
  • The slower the browse, the more important content based retrieval becomes
  • Interactivity on WWW
  • Needs to stage the search (do the fastest one first)
  • Access control
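
    Staging the search (fastest first) can be sketched as a two-pass query: a cheap distance prunes the database to a shortlist, and a costlier distance ranks only the shortlist. The function staged_search, both distance functions, and the data are assumptions made for this example.

```python
def staged_search(query, database, cheap, costly, shortlist_size, k):
    """Stage 1: prune with the cheap distance. Stage 2: rank survivors
    with the costly distance and return the best k names."""
    shortlist = sorted(database,
                       key=lambda name: cheap(query, database[name]))
    shortlist = shortlist[:shortlist_size]
    return sorted(shortlist,
                  key=lambda name: costly(query, database[name]))[:k]


costly_calls = []  # records how often the expensive comparison runs


def cheap_distance(x, y):
    """Compare only the first descriptor component."""
    return abs(x[0] - y[0])


def costly_distance(x, y):
    """Compare all components (stands in for an expensive match)."""
    costly_calls.append(1)
    return sum(abs(a - b) for a, b in zip(x, y))


database = {
    "a": [0.9, 0.1, 0.0],
    "b": [0.75, 0.2, 0.05],
    "c": [0.1, 0.1, 0.8],
    "d": [0.0, 0.5, 0.5],
}
query = [0.8, 0.2, 0.0]

result = staged_search(query, database, cheap_distance, costly_distance,
                       shortlist_size=2, k=1)
print(result, len(costly_calls))
```

    The expensive comparison runs only on the shortlist, not on the whole database. The price is that a good match can be lost if the cheap stage prunes it, so the shortlist size is a tuning choice.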

    18 Data Capture and annotation/indexing

  • Often overlooked, but key to success. Very expensive to digitize and have people key in the data for 10,000,000s of images
  • Data capture: digitization, color accuracy, sizing, cropping etc.
  • Data annotation/indexing: entering the keywords and data; obtaining related data from other sources (other DB, cameras etc.); preprocessing to extract content descriptors; outlining for objects
  • Issues of people's inconsistency, costs of training
  • Tightly controlled processes for data input
  • Meta data (indices) might become more valuable than BLOBS

    19 Extensions to Video

  • Video also needs to be searchable by keywords and data
  • Most of the image database issues apply here, with important additions:
  • Browsing video is even more time consuming due to its size
  • Break video into scenes, represent each scene with data and the keyframe. Or create salient stills to represent the video
  • Use image content descriptors on the keyframe
  • Video-specific content descriptors, such as motion (camera, object), etc.
  • Systems issues even more complex: size of data, QOS etc.
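
    Breaking video into scenes and picking keyframes can be sketched by comparing grey-level histograms of consecutive frames and starting a new scene when the change exceeds a threshold. Histogram differencing is one common heuristic, assumed here for illustration; the lecture does not say which method is used. Frames are hypothetical flat lists of grey levels, and the middle frame of each scene is taken as its keyframe.

```python
def grey_histogram(frame, bins=4):
    """Normalized histogram of grey levels (0-255) in one frame."""
    step = 256 // bins
    hist = [0] * bins
    for value in frame:
        hist[min(value // step, bins - 1)] += 1
    return [h / len(frame) for h in hist]


def split_into_scenes(frames, threshold):
    """Group frame indices into scenes, cutting where the histogram
    change between consecutive frames exceeds the threshold."""
    scenes = [[0]]
    previous = grey_histogram(frames[0])
    for i in range(1, len(frames)):
        current = grey_histogram(frames[i])
        change = sum(abs(a - b) for a, b in zip(previous, current))
        if change > threshold:
            scenes.append([])      # abrupt change: start a new scene
        scenes[-1].append(i)
        previous = current
    return scenes


def keyframes(scenes):
    """Pick the middle frame index of each scene as its keyframe."""
    return [scene[len(scene) // 2] for scene in scenes]


dark = [10, 20, 30, 40]
bright = [200, 210, 220, 250]
frames = [dark, dark, dark, bright, bright]

scenes = split_into_scenes(frames, threshold=0.5)
print(scenes, keyframes(scenes))
```

    Once keyframes are chosen, the still-image descriptors (color, texture, shape) can be computed on them exactly as for a still image database.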