CHCIS draft

Customer Models for Effective Presentation of Information (draft)

Gio Wiederhold
Stanford University
3 Nov 1996

Topic area: Intelligent information processing and agents

Problem Statement

To deal with the flood of information that is becoming accessible to the growing population of computer-literati, it is not adequate to have systems that provide a superficially friendly presentation; we need an underlying structure that is natural to the task being undertaken. Within tasks I distinguish both the cognitive aspect, such as browsing, problem solving, problem definition, classification, and authoring, and the domain aspect, such as finance, health concerns, entertainment, travel, information management, genomics, and engineering design.

Related to all these foci and topics is a wealth of information, which can only be managed effectively by imposing structure on the information objects and assessing their value. The objective of providing mechanical aids towards this goal seems daunting, but it appears to be a requirement for bringing the end-goals of initiatives such as the Digital Library, the World-Wide Web (insofar as it has a goal), much Artificial Intelligence research, and many computational decision aids into a form that will be beneficial to the human user. I will refer to the overarching aspects of this effort with respect to Human Centered Intelligent Systems as Human-Centered Information Services (HCIS).

Structuring the Setting

Before we can discuss details of approaches to deep human-centered information services we must structure, and hence simplify, the task at hand. This initial task, common to most of our productive activities, is exactly the type of task that should be aided by HCIS, and we can introspect to gain an understanding of what services might be helpful in this domain task: defining the problem of information management.

First of all we model the human as an individual engaged in a certain task type. A human can engage in many types of tasks, but it is likely that a human is productive only if engaged in a specific task for some time. Tasks are not necessarily carried out to completion before a task switch occurs (notwithstanding advice your parents gave you), but some observable progress is desired. Here we already note a mechanizable service component: recording where one started and where one left off, so that on returning to the task one can proceed, or roll back, as wanted.
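The recording service mentioned above can be illustrated with a minimal sketch. It assumes that progress within a task can be reduced to an ordered list of named positions; the class `TaskSession` and its method names are hypothetical and serve only as illustration.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class TaskSession:
    """Hypothetical record of a customer's progress within one task."""
    name: str
    checkpoints: list = field(default_factory=list)  # ordered history of positions

    def record(self, position: str) -> None:
        # Remember where the customer currently is in the task.
        self.checkpoints.append(position)

    def resume(self) -> Optional[str]:
        # On returning to the task, continue from the last recorded position.
        return self.checkpoints[-1] if self.checkpoints else None

    def rollback(self, steps: int = 1) -> Optional[str]:
        # Discard the most recent `steps` positions, then report the new current one.
        if steps > 0:
            del self.checkpoints[-steps:]
        return self.resume()


session = TaskSession("plan trip")
session.record("chose destination")
session.record("compared fares")
session.record("picked hotel")
resumed = session.resume()      # the position where the customer left off
rolled = session.rollback()     # the position one step earlier
```

Keeping one such record per task, rather than one per human, is what allows a task switch to occur without losing the place reached in the task that was set aside.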

I employ the term customer for a human engaged in a task. A customer model is hence simpler than a general user model, which must recognize the interplay of many tasks and domains. Simplification is of course a prime engineering concept: only simple things work as expected, and sophisticated tools and models are more likely a hindrance than a benefit.

The next simplification is to assume that a customer model is hierarchical. This is a giant and far-reaching assumption, and I am sure that exceptions can be found. I will deflect criticism by a tautology: if the customer model cannot be hierarchically represented then the human must be engaged in more than one task. Once the hierarchy is accepted we have a wealth of tools available. Most applicable work in decision analysis, in utility theory, in planning, and in scheduling becomes of bounded complexity if the structure is hierarchical. Furthermore, within a hierarchy we can often impose a closed-world assumption, so that negation becomes a permissible operator in processing. Such assumptions are often made implicitly; for instance, Prolog's inferencing relies on negation as failure. The customer model makes the assumption explicit.
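A minimal sketch of the closed-world assumption within a hierarchy follows. The hierarchy contents and the function names are hypothetical; the point is only that, once the hierarchy is taken to be complete, failure to find an item can be treated as a negative answer.

```python
# Hypothetical customer-model hierarchy: each node maps to its children.
HIERARCHY = {
    "travel": ["flights", "hotels"],
    "flights": ["fares", "arrival times"],
    "hotels": [],
    "fares": [],
    "arrival times": [],
}


def descendants(node: str) -> set:
    """Collect every node below `node` in the hierarchy."""
    found = set()
    stack = list(HIERARCHY.get(node, []))
    while stack:
        current = stack.pop()
        if current not in found:
            found.add(current)
            stack.extend(HIERARCHY.get(current, []))
    return found


def is_part_of(item: str, root: str) -> bool:
    """Positive query: is `item` recorded anywhere below `root`?"""
    return item in descendants(root)


def is_not_part_of(item: str, root: str) -> bool:
    """Negation under the closed-world assumption: failing to find `item`
    below `root` is taken as proof that it does not belong there."""
    return not is_part_of(item, root)
```

Without the closed-world assumption the second function would have to answer "unknown" rather than "true" for any item that is simply absent from the model.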

Domains

Domain specialization introduces a further simplification. Within a domain any term should have only one semantic meaning, acceptable to all customers working in that domain. A term such as `nail' is defined distinctly in each of several domains, for instance in anatomy and in hardware. We again use a tautology to make the condition true: if there are inconsistent interpretations of a term, then we are dealing with multiple domains.
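The one-meaning-per-domain rule can be sketched as a lookup that always requires a domain. The vocabularies and definitions below are hypothetical placeholders, not entries from any real ontology.

```python
# Hypothetical vocabularies: within one domain, a term has exactly one meaning.
VOCABULARIES = {
    "anatomy": {"nail": "keratin plate at the tip of a finger or toe"},
    "hardware": {"nail": "metal pin driven in with a hammer"},
}


def meaning(term: str, domain: str) -> str:
    """Look a term up inside a single domain; no cross-domain guessing."""
    try:
        return VOCABULARIES[domain][term]
    except KeyError:
        raise LookupError(f"{term!r} is not defined in domain {domain!r}") from None


def domains_defining(term: str) -> list:
    """List the domains that define the term; more than one entry means the
    customer's current domain must be known before the term is interpreted."""
    return sorted(d for d, vocab in VOCABULARIES.items() if term in vocab)
```

Because every lookup names its domain, the two definitions of `nail' never compete; the conflict only becomes visible when a task spans both domains.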

By keeping domains coherent and hence of modest size we avoid many common semantic problems. We have many instances where effective ontologies have been created by specialists focusing on a narrow domain, and failures and high costs when such ontologies were expanded in scope. Establishing committees to solve ontological problems over multiple domains (using our definition) is likely to lead to unhappiness among customers and specialists, to whom a terminological compromise is of little benefit.

An example of the domain scaling issue in computing is seen in object technology. Simple objects are attractive, because they can represent data and process constellations in what appears to be a `natural' way. It is no coincidence that their internal structure is typically hierarchical. Inheritance of features in a hierarchical structure of multiple objects provides an effective conceptual simplification for their customers. When object information over multiple domains is integrated, so that multiple inheritance has to be modeled, confusion ensues. Similarly, objects become unwieldy when they are large and serve multiple tasks. Many of the committees convened to design the `right' objects in industry and government are making glacial progress and their work is likely to be ignored.

Partitioning and Composition

Now that we have structured the world supporting human information services into coherent units, we need tools to extract those units out of the real world and compose the units to serve, first of all, specific tasks and domains, and secondarily, to manage information where multiple tasks and domains intersect. We model extraction as the selection of hierarchical submodels out of the world of information resources. We have small-scale examples today where computational object models are defined over arbitrary database schemas, and corresponding object instances are created out of the contents of the corresponding relational databases. Web tools such as Yahoo impose a largely hierarchical high-level structure onto much of the information stored in the World-Wide Web, and are productive when the hierarchy presented matches a customer model.
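As a rough illustration of extracting a hierarchical submodel from relational contents, the sketch below nests flat rows into a tree. The rows, column names, and the function `extract_hierarchy` are hypothetical, and the sketch is not meant to represent any particular object-over-database system.

```python
# Hypothetical flat rows, as they might result from joining relational tables.
ROWS = [
    {"dept": "Cardiology", "patient": "P1", "visit": "1996-10-01"},
    {"dept": "Cardiology", "patient": "P1", "visit": "1996-10-15"},
    {"dept": "Cardiology", "patient": "P2", "visit": "1996-10-03"},
    {"dept": "Oncology", "patient": "P3", "visit": "1996-10-07"},
]


def extract_hierarchy(rows: list, levels: list) -> dict:
    """Nest flat rows into a tree that follows `levels` from top to bottom.
    The last level becomes a list of leaf values under its parent."""
    if len(levels) < 2:
        raise ValueError("need at least one grouping level and one leaf level")
    *inner, leaf = levels
    tree = {}
    for row in rows:
        node = tree
        for depth, level in enumerate(inner):
            is_last_inner = depth == len(inner) - 1
            # The deepest grouping level holds a list of leaves; others hold dicts.
            node = node.setdefault(row[level], [] if is_last_inner else {})
        node.append(row[leaf])
    return tree


by_department = extract_hierarchy(ROWS, ["dept", "patient", "visit"])
by_patient = extract_hierarchy(ROWS, ["patient", "visit"])
```

The same rows yield different hierarchies depending on the order of `levels`, which is the sense in which the submodel is chosen to fit the customer rather than the database schema.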

Searching through a reasonably balanced hierarchy is of logarithmic cost in the number of items it holds, and is acceptable to most customers. Success depends of course on having the instances properly composed and linked into the task hierarchy. Items at the same level in a hierarchy should be ordered into a priority-by-utility list that is again dependent on the customer and domain model. For instance, air-flight fares and arrival times have different utilities for vacation versus business travel.
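Both points in this paragraph can be sketched briefly, under two stated assumptions: the hierarchy is balanced with a fixed branching factor, and utilities can be expressed as numeric weights per customer model. The weights below are invented for illustration only.

```python
# Hypothetical utility weights per customer model; the values are illustrative.
UTILITY = {
    "vacation": {"fare": 0.8, "arrival time": 0.2},
    "business": {"fare": 0.3, "arrival time": 0.7},
}


def levels_to_descend(leaf_count: int, branching: int) -> int:
    """Levels needed to reach one of `leaf_count` items in a balanced
    hierarchy in which every node offers `branching` choices."""
    if branching < 2:
        raise ValueError("branching factor must be at least 2")
    levels, reachable = 0, 1
    while reachable < leaf_count:
        reachable *= branching
        levels += 1
    return levels


def order_by_utility(items: list, customer_model: str) -> list:
    """Order sibling items so that the most useful ones come first."""
    weights = UTILITY[customer_model]
    return sorted(items, key=lambda item: weights.get(item, 0.0), reverse=True)


depth_small = levels_to_descend(1_000, 10)        # 3 levels
depth_large = levels_to_descend(1_000_000, 10)    # 6 levels
for_business = order_by_utility(["fare", "arrival time"], "business")
for_vacation = order_by_utility(["fare", "arrival time"], "vacation")
```

A thousandfold growth in the number of items only doubles the number of levels to descend, while the ordering at each level changes with the customer model and not with the data.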

If the hierarchy is not satisfactory then broader access tools, such as Alta-Vista on the web, may be used. These impose a higher cost on the human, who must now impose his or her own hierarchy if the list exceeds, say, 7 plus or minus 2 items.

A first goal for research in HCIS is hence a clarification of these task models, and the development of tools to make the human into a productive customer. For any hierarchy it should be possible to structure the domain-relevant units located by a search into an effective and natural structure for the customer. At the same time, task and domain switching must be recognized, while prior task models must be retained to be re-enabled if the human returns to a past customer model.

Once we have clear domain and task models we need to be able to follow human complexities, and develop means not only to switch, but to recognize intersections. A new domain being entered is likely related to a prior domain. It would be unwise to keep all of the prior context available, but recently active subsets of the domain are likely to have articulation points that need to be recognized. In our research we visualize an algebra over ontologies, to manage domain intersections needed for complex, multi-domain tasks. Other approaches are likely to be at least as valid, and it is in this arena that a second major research task for HCIS is likely to be found.
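To make the notion of an intersection between domains concrete, here is a deliberately simplified sketch. It assumes that an ontology can be reduced to a set of terms and that articulation points are given as explicit pairs of matching terms; it should not be read as the algebra itself, and all names and contents are hypothetical.

```python
# Two hypothetical domain ontologies, reduced to plain sets of terms.
TRAVEL = {"fare", "itinerary", "carrier", "arrival time"}
FINANCE = {"price", "budget", "currency", "invoice"}

# Hypothetical articulation rules: pairs of terms judged to match across domains.
ARTICULATION = {("fare", "price")}


def intersect(left: set, right: set, rules: set) -> set:
    """Return the matched term pairs that link the two ontologies.
    Only pairs whose terms actually occur in each ontology are kept."""
    return {(a, b) for (a, b) in rules if a in left and b in right}


def difference(left: set, right: set, rules: set) -> set:
    """Return the terms of `left` that have no articulation to `right`."""
    linked = {a for (a, _) in intersect(left, right, rules)}
    return left - linked


shared = intersect(TRAVEL, FINANCE, ARTICULATION)
travel_only = difference(TRAVEL, FINANCE, ARTICULATION)
```

Only the articulated pairs need to be carried along when a customer moves from one domain to the other; the remaining terms of the prior domain can be set aside.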

Linking the models that allow effective human processing and services to the human-computer interface requires matching the deep semantic structures of HCIS to representations that exploit human cognition effectively. This is the third area for HCIS research that we recognize. Here aggregation and visual representations are likely to be crucial, with easy linkages for expansion down the hierarchy, aggregation up the hierarchy, and context switching among customer models and domains.

Conclusion:

Information must be of a value that is greater than the human cost of obtaining and managing it. More is hence not better; less, but relevant, information is best.

To achieve the desired goal, research and experiments in providing Human-Centered Information Services are needed. We indicated three topics:

Task models and tools that exploit these models in order to bridge the gap from a human effort to simple, clear, and processable underlying structures.
Tools for task-switching and domain switching and intersections so that the simple task models become composable into practical scope.
Clear mapping of the explicit models and their results into cognitively effective representations.

And then, while we're at it, we should also have fun.

Acknowledgment.

This note depends on results obtained by many researchers. In a proper paper the list of references would be longer than this note itself. I do wish to acknowledge observations made by many others, including (in alphabetical order) Jean-Raymond Abrial, Yigal Arens, Avron Barr, Dines Bjorner, Barry Boehm, Michael Brodie, Mike Genesereth, Ed Feigenbaum, Michael Kuhn, Joshua Lederberg, Doug Lenat, Vaughan Pratt, Ray Reiter, Paul Saffo, Jeffrey Ullman, and all my students. People listed here may be surprised at finding themselves in each other's company, and I certainly don't agree with everything they have said or written, but they all have contributed significantly to the issues. Finally I would like to salute Larry Rosenberg, who supported so strongly the human role in many of the discussions leading to NSF's Digital Library Initiative, without being able to see it to fruition.