CS99I Meeting 11 Notes: Search engine performance

By Gio Wiederhold, 19 Jan 2000.

Topics Covered briefly

How to write

On the web, what you write may be read by a wide variety of people, all of whom want to benefit from the effort. In traditional publishing the publisher may provide an editor to help you. If you publish on the web and don't have an editor, be extra careful. To help the readers and customers:

  1. Introduce the specific topic early
  2. Identify yourself and give the date of writing - stuff will stick around
  3. Identify -- politely -- the expected level of the audience
  4. Use words appropriate to that level
  5. Write gender-neutral prose -- use plurals or repeat nouns and adjectives

Browsers

(Continued from Notes 10)

Measures of performance

The customer's objectives when searching are

  1. Finding all relevant information: perfect recall, as long as it does not overload them.
  2. Not receiving any irrelevant stuff: high precision.

These two objectives conflict with each other.
Perfect recall can be obtained by giving you all of the web!
But then precision is minimal, since almost everything returned is irrelevant. A browser should use methods that maximize precision (by ranking results).
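
As a rough illustration of this conflict, the sketch below computes recall and precision for a toy collection. The document numbers, the collection size, and the function name recall_precision are assumptions made for this example only; they do not come from the lecture.

```python
def recall_precision(retrieved, relevant):
    """Return (recall, precision) for one search result.

    recall    = fraction of the relevant documents that were retrieved
    precision = fraction of the retrieved documents that are relevant
    Empty sets are scored as 1.0 here; that is a convention chosen for
    this sketch, not a standard.
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = retrieved & relevant
    recall = len(hits) / len(relevant) if relevant else 1.0
    precision = len(hits) / len(retrieved) if retrieved else 1.0
    return recall, precision


# Hypothetical collection of 1000 documents, of which 5 are relevant.
collection = set(range(1000))
relevant = {3, 17, 256, 400, 999}

# Returning the whole collection: perfect recall, minimal precision.
everything = recall_precision(collection, relevant)

# A selective result: 4 relevant documents plus 1 irrelevant one.
selective = recall_precision({3, 17, 256, 400, 501}, relevant)

print(everything)  # (1.0, 0.005)
print(selective)   # (0.8, 0.8)
```

Returning everything scores perfectly on recall but only 0.005 on precision, while the selective result gives up a little recall for much higher precision.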

In the figure, the information density decreases towards the right. The best methods are on the left.

Type 1 errors measure material not retrieved that should have been; they lower recall.
Type 2 errors measure material retrieved that should not have been; they lower precision.
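
Both kinds of error can be read directly off the retrieved and relevant sets. The sketch below follows the numbering used in these notes (Type 1 = missed, Type 2 = wrongly retrieved); the function name retrieval_errors and the sample document labels are hypothetical.

```python
def retrieval_errors(retrieved, relevant):
    """Return (type1, type2) error sets, numbered as in these notes."""
    retrieved, relevant = set(retrieved), set(relevant)
    type1 = relevant - retrieved  # should have been retrieved, but was not
    type2 = retrieved - relevant  # was retrieved, but should not have been
    return type1, type2


# Hypothetical example: four relevant documents, three retrieved.
relevant_docs = {"a", "b", "c", "d"}
retrieved_docs = {"b", "c", "x"}

type1, type2 = retrieval_errors(retrieved_docs, relevant_docs)

print(sorted(type1))  # ['a', 'd']
print(sorted(type2))  # ['x']
```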

Note that on the web the amount of material wanted (the thin box on the left) is infinitesimal compared with the stuff that is available.