CS 545I - Advanced Image and Video Databases, W 96/97

Friday, 10 Jan 1997. First lecture, CS 545I

Prof. Gio Wiederhold, Stanford; Dr. Dragutin Petkovic, IBM Almaden Research Labs; and Oscar Firschein, Stanford

See the new material on how to use the IBM QBIC system online that appears as Item 10.

1 - Why study image and video databases?

Image and video databases are becoming an increasingly important type of database as sources of images multiply, storage methods improve, and the Net provides the means of communication. However, both still images and video sequences have important characteristics not shared by conventional data, and the database designer must know and understand these characteristics.

2 - Goal of seminar

To gain an appreciation of the special problems of image and video retrieval, to learn about some current systems and about indexing and database organization techniques, and to get hands-on experience with one of the systems.

3 - Some sources of images

  • Medical imagery (pathology slides, x-ray, NMR, ultrasonic, etc.)
  • News and entertainment videos
  • Educational videos
  • Art and photo collections
  • Consumer and engineering catalogues
  • Scientific images (astronomy, earth resources, etc.)
  • Images collected by intelligence agencies (often satellite images); video sequences taken by unmanned vehicles
  • Home photos/videos

4 - Types of user request

    There is a remarkable variety of user requests. Users may want combinations of the following requests:

  • SIMILARITY: Find an image that looks like this image (or parts of it look like part of this image)
  • OBJECT: Find an image that contains a cat
  • CONDITION/SITUATION: Find photos of water pollution
  • SPECIFIC PERSON: Find a video frame of Clinton talking to Rabin
  • OBJECT RELATIONSHIP: Find an image that contains a cat near a dog
  • MOOD: Find a sad/happy/... picture
  • VIEW ANGLE: Find a picture of a crowd taken from an airplane
  • TIME OF DAY/SEASON: Find a picture of Yosemite taken during the day/at night/at sunset/in winter
  • COLOR: Find a picture with a red apple
  • TEXTURE: Find a picture with a brick texture
  • SHAPE: Find a picture with a circular object
  • GEOGRAPHIC: Find an aerial image of the port of San Francisco

5 - Approach to image database design

    One could treat an image database as if it were a document database:

  • Create a database in which each image is manually tagged with index terms
  • Queries are combinations of index terms.
  • User reviews results and modifies query
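
The document-style approach above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the image ids, the tags, and the helper names `build_index` and `query_all` are hypothetical, and a "combination of index terms" is taken here to mean a simple AND of all terms.

```python
from collections import defaultdict

def build_index(tags_by_image):
    """Map each index term to the set of image ids manually tagged with it."""
    index = defaultdict(set)
    for image_id, tags in tags_by_image.items():
        for tag in tags:
            index[tag.lower()].add(image_id)
    return index

def query_all(index, terms):
    """Return the sorted ids of images tagged with every term (an AND query)."""
    sets = [index.get(term.lower(), set()) for term in terms]
    if not sets:
        return []
    return sorted(set.intersection(*sets))

# Hypothetical hand-tagged collection (assumption, not course data).
TAGS = {
    "img1": ["beach", "sunset"],
    "img2": ["beach", "crowd"],
    "img3": ["cat", "dog"],
}
INDEX = build_index(TAGS)

print(query_all(INDEX, ["beach"]))            # ['img1', 'img2']
print(query_all(INDEX, ["beach", "sunset"]))  # ['img1'] (the user narrowed the query)
```

Note that the search can only find what the person doing the tagging chose to write down, which is the limitation taken up next.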

    HOWEVER, this approach does not take advantage of the special aspects of an image:

  • People can review candidate retrieved images very fast, and can quickly indicate whether the search is converging
  • An image has many "meanings" depending on the interest or "point of view" of the viewer

6 - An ideal image database system:

    1. allows the user to review image "thumbnails"
    2. presents images in order of "closeness" to the query
    3. uses a variety of description approaches, many of them automated
    4. allows retrieval using conventional (relational, etc.) databases
    5. allows search refinement
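
As an illustration of item 2 (presenting images in order of "closeness"), here is a minimal Python sketch of query by example. It rests on assumptions that are not part of the course material: images are plain lists of (r, g, b) pixels, the automated descriptor is a coarse color histogram, and closeness is the L1 distance between histograms. Real systems use richer descriptors and metrics.

```python
def color_histogram(pixels, bins_per_channel=2):
    """Normalized coarse RGB histogram of a non-empty list of (r, g, b) pixels (0-255)."""
    step = 256 / bins_per_channel

    def bin_of(value):
        return min(int(value / step), bins_per_channel - 1)

    hist = [0.0] * bins_per_channel ** 3
    for r, g, b in pixels:
        index = (bin_of(r) * bins_per_channel + bin_of(g)) * bins_per_channel + bin_of(b)
        hist[index] += 1.0
    return [count / len(pixels) for count in hist]

def l1_distance(h1, h2):
    """Sum of absolute differences; 0.0 means identical histograms."""
    return sum(abs(a - b) for a, b in zip(h1, h2))

def rank_by_closeness(query_pixels, database):
    """Return (image_id, distance) pairs, closest image first."""
    query_hist = color_histogram(query_pixels)
    scored = [(image_id, l1_distance(query_hist, color_histogram(pixels)))
              for image_id, pixels in database.items()]
    return sorted(scored, key=lambda pair: pair[1])

# Tiny hypothetical "images" of four pixels each.
RED, BLUE = (255, 0, 0), (0, 0, 255)
DATABASE = {
    "mostly_red": [RED, RED, RED, BLUE],
    "mostly_blue": [BLUE, BLUE, BLUE, RED],
    "all_red": [RED, RED, RED, RED],
}
RANKED = rank_by_closeness([RED] * 4, DATABASE)
print(RANKED)  # [('all_red', 0.0), ('mostly_red', 0.5), ('mostly_blue', 1.5)]
```

The user would see thumbnails in this order and could pick one of them as the next example, which is one simple form of search refinement.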

7 - Capabilities needed

    1. Similarity metric must match the human idea of similarity of images
    2. Search must be efficient enough to be interactive
    3. User must be able to specify needs without becoming an image or DB expert. A good approach is for the user to specify them by providing examples close to what is desired
    4. Extraction of image descriptors must be automated

8 - Problems

  • How to normalize an image: scale, orientation
  • Capturing aspects of content by using invariants or discriminants
  • Can these invariants capture semantic information?
  • Efficiency of invariants
  • Can the title or caption of an image (or the audio portion of a video) aid in finding invariants?

9 - Some existing commercial systems

    IBM WWW QBIC combines QBIC (Query by Image Content) color, shape, and texture searching with traditional DB searching (Jan 24 and Jan 31 presentations).

    Virage, San Mateo, CA: image and video DB retrieval (Feb 7 and Feb 14 presentations).

10 - QBIC (TM) - Query by Image Content

    * What is QBIC

    The QBIC (Query By Image Content) project at IBM's Almaden Research Center is studying methods to query large on-line image databases using the images' content as the basis of the queries. Examples of the content used include color, texture, shape, size, orientation, and position of image objects and regions. Key issues include derivation and computation of attributes of images and objects that provide useful query functionality, retrieval methods based on similarity as opposed to exact match, query by image example or user-drawn image, the user interfaces, query refinement and navigation, high-dimensional database indexing, and automatic and semi-automatic database population.

    We have developed an AIX prototype system, a product called Ultimedia Manager (with STL), a set of APIs, plus a WWW demo. We have also developed a separate QBIC search engine for AIX, NT, Linux, Win95 and OS/2 that can be downloaded from the WWW. Our test databases include over 10,000 images from a variety of sources.

    The key applications of this technology are in areas where image patterns are the basis of the queries, such as retail cataloging, stock photo archives, art, textile manufacturing, and business graphics. QBIC technology has to be integrated with other traditional search technologies like SQL and text search in order to be useful in real applications. Among others, this technology is used by the UC Davis Art Library (Prof. B. Holt) to answer queries like "Give me all artists that use brush strokes like Van Gogh", which were not possible to answer using standard keyword search methods, and by Prof. J. Hethorn, also from UC Davis, to study trends in fashion.

    Note one key advantage of QBIC for browsing images over the WWW. Say you are looking for images of beach scenes. In our demo you can type BEACH for a text search. Say you get thousands of images of beach scenes. It is impractical to browse all of them over the WWW. But you may be interested in a beach image with some pink tones. You can go to the customized search, choose the color percentage option, and select 30% pink. All images that have "beach" as a keyword are then sorted at the server by color, and you can display only the top few. This way you can browse much more quickly. Then, from the result screen, you can click on any image and get others that look like that one.

    The application areas we are looking at are stock photography, retail/shopping, electronic catalogs, and image libraries.

    * How to find QBIC on the WWW:

    wwwqbic.almaden.ibm.com

    The database has about 2000 images. Note that each image has only a few keywords; because the DB is limited, typing a keyword that does not exist may produce no response.

    Note that the keywords are used to select a subset of images, which are then SORTED by similarity to a QBIC-like query. A QBIC-like query can be generated either by selecting colors in the customized search or by pointing to an image.
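
The two-stage behavior just described (keywords select a subset, which is then sorted by similarity) can be sketched as follows. This is not QBIC's actual algorithm or API: the catalog records, the named-color percentages, and the functions `color_percentage_distance` and `search` are hypothetical stand-ins chosen to keep the example small.

```python
def color_percentage_distance(target, actual):
    """How far an image's color fractions are from the requested ones (0.0 = exact)."""
    return sum(abs(fraction - actual.get(color, 0.0))
               for color, fraction in target.items())

def search(catalog, keyword, target_colors, top_k=3):
    """Stage 1: keep images carrying the keyword. Stage 2: sort them by color distance."""
    subset = [image for image in catalog if keyword in image["keywords"]]
    ranked = sorted(subset,
                    key=lambda image: color_percentage_distance(target_colors, image["colors"]))
    return [image["id"] for image in ranked[:top_k]]

# Hypothetical catalog: keywords entered by a librarian, color fractions precomputed.
CATALOG = [
    {"id": "beach_noon", "keywords": {"beach"}, "colors": {"blue": 0.6, "yellow": 0.4}},
    {"id": "beach_dusk", "keywords": {"beach", "sunset"}, "colors": {"pink": 0.35, "blue": 0.65}},
    {"id": "rose", "keywords": {"flower"}, "colors": {"pink": 0.3, "green": 0.7}},
    {"id": "beach_dawn", "keywords": {"beach"}, "colors": {"pink": 0.1, "blue": 0.9}},
]

# "beach" plus 30% pink: the rose is pink but is filtered out by the keyword stage.
print(search(CATALOG, "beach", {"pink": 0.30}, top_k=2))  # ['beach_dusk', 'beach_dawn']
```

Because only the top of the ranked list is returned, the user needs to download only a few thumbnails rather than the whole keyword match set.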

    You can even download a free 90-day version of the QBIC software from its home page. This software includes the QBIC search engine and WWW templates to get your application up and running in a very short time.

    * Some good demos for QBIC on the WWW:

    - Type the word beach in the text field. You get images whose keywords, entered by an image administrator or librarian, include that word. This is not QBIC ... yet. Assume you want a beach image with some pink coloration. Why use slow WWW browsing when you can use QBIC? It is also unlikely that anybody encoded such information in keywords. Even if they tried, people are notoriously poor at consistently describing image "look and feel" using keywords. This is where QBIC shines! Make sure COLOR SIMILARITY is set to COLOR PERCENTAGE. QBIC will sort the images by color similarity, so you need only look at the top of the ranked list - this is MUCH faster on the WWW. Go to CUSTOMIZED SEARCH and select about 40% pink. Run the query. Note that the beach scenes are now re-sorted. Click on some other image and see what you get. This is how you get queries of the type "show me more images LIKE this one". Only QBIC can do it. Now, find the image of the beach at night, with silvery water in the middle. Switch COLOR SIMILARITY to COLOR LAYOUT. Click on the image of the beach at night, and see that you are getting "similar" images.

    - Clear the text field. Set COLOR SIMILARITY to COLOR PERCENTAGE, go to CUSTOMIZED SEARCH, and select 40% yellow and 40% red. Look at the images: it does NOT matter where the colors are in XY space. Now change the similarity to COLOR LAYOUT. Click on an image with black at the bottom and yellow at the top. See the difference: all images now have a similar color layout.
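
The difference between the two settings in this demo can be made concrete with a small Python sketch. The notes do not describe how QBIC computes its color layout feature, so what follows is a stand-in under stated assumptions: an image is a list of rows of color names, "percentage" compares color fractions over the whole image, and "layout" compares them strip by strip from top to bottom. All function names are hypothetical.

```python
from collections import Counter

def percentages(rows):
    """Fraction of each color label over a non-empty list of pixel rows."""
    counts = Counter(label for row in rows for label in row)
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}

def distance(p, q):
    """L1 distance between two color-fraction dictionaries."""
    return sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in set(p) | set(q))

def percentage_distance(img_a, img_b):
    """Ignores position: only the overall amount of each color matters."""
    return distance(percentages(img_a), percentages(img_b))

def layout_distance(img_a, img_b, bands=2):
    """Compares horizontal strips pairwise; assumes the height is a multiple of `bands`."""
    def split(img):
        size = len(img) // bands
        return [img[i * size:(i + 1) * size] for i in range(bands)]
    return sum(distance(percentages(a), percentages(b))
               for a, b in zip(split(img_a), split(img_b)))

Y, K = "yellow", "black"
SUNSET = [[Y, Y], [Y, Y], [K, K], [K, K]]   # yellow at the top, black at the bottom
FLIPPED = SUNSET[::-1]                      # same colors, opposite layout

print(percentage_distance(SUNSET, FLIPPED))  # 0.0: identical under COLOR PERCENTAGE
print(layout_distance(SUNSET, FLIPPED))      # 4.0: very different under this layout measure
```

Under the percentage measure the flipped image is a perfect match, which is why the first query returns images regardless of where the colors fall; a position-aware measure separates them.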