Between the minds that plan and the hands that build there must be a mediator [Brigitte Helm in Metropolis, a silent movie by Fritz Lang, 1926]
On the information highways, we are encountering diversity of roads and vehicles as well. The diversity becomes a greater concern as we reach out beyond our homes and workplaces. While once computers operated in stand-alone mode, or used simple connections to each other, the information highways we are contemplating will have many thousands of participating computers, and even more interchange points. While the notion of having similar computers, and similar data and information structures everywhere would make life simple, such a coherence is clearly not feasible, just as we could not have a single mode of transport nor a single package for all the goods to be shipped. Progress also requires change, and incremental changes also create inconsistencies. While Henry Ford might have been content if we all stayed with the Model T, over time most Model T's were replaced by faster, bigger, and more colorful vehicles.
The software equivalents of interchange hubs are called mediators. Mediation is an integrating concept, combining a number of current technologies to find and transform data, and making them available in hubs along the information highways. Mediation recognizes the autonomy and diversity of the data systems, information services, and user applications. Their autonomy enables the overall system to grow, since new sources, new means of transport, and novel information processes can be inserted. Incremental growth only requires that a few mediating hubs be adapted to link the new facilities into the traffic network. As the new facilities become more popular, further mediators will adapt to take advantage of them and the business they represent. Those users that need the new sources will use the adapted mediators; users that don't care remain unaffected.
Just as hubs for transport have become specialized to deal with people, letters, goods, and foodstuffs such as fruit, fish, beef, and the like, mediators will also be specialized. Specialization makes maintenance feasible: an expert can focus on his or her own domain, without having to consider the different constraints imposed by handling fish versus bicycles.
At a hub one often finds hotels for people and warehouses for goods. These are necessary because the variety of transport mechanisms cannot be perfectly synchronized. The equivalent function in digital systems is intermediate storage, or caching, of data. The technology for storage of data was discussed, but the function differs here. In a database a long-term, coherent collection of information must be stored and maintained. In a mediator only copies of information to be forwarded need to be stored. Such data must be identified well enough, with name, type, source, and date-of-validity, so that it can be merged with related data. If data has to be kept a long time before a match can occur, it may be wise to keep just a reference in the mediator, and acquire the data from the source database when required. Sometimes much processing is needed to merge data, so that specialized processes may be associated with a mediator, just as transportation hubs rapidly become surrounded by specialized factories.
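The identification requirements above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class name `HeldItem`, its field names, and the matching rule in `can_merge` (same name, type, and date of validity) are hypothetical choices, not something the text prescribes.

```python
from dataclasses import dataclass
from datetime import date
from typing import Any, Callable, Optional


@dataclass
class HeldItem:
    """One piece of data waiting in a mediator (all names are illustrative)."""
    name: str                 # what the data describes
    kind: str                 # its type
    source: str               # where it came from
    valid_on: date            # date-of-validity
    copy: Optional[Any] = None                  # forwarded copy, if one is held
    fetch: Optional[Callable[[], Any]] = None   # reference back to the source

    def value(self) -> Any:
        """Return the held copy, or acquire the data from its source."""
        if self.copy is not None:
            return self.copy
        if self.fetch is None:
            raise LookupError(f"{self.name}: neither a copy nor a reference is held")
        return self.fetch()


def can_merge(a: HeldItem, b: HeldItem) -> bool:
    """Assumed matching rule: same name, same type, same date of validity."""
    return (a.name, a.kind, a.valid_on) == (b.name, b.kind, b.valid_on)


# A short-lived copy and a long-lived reference to comparable data.
held = HeldItem("price", "decimal", "Shoesales", date(1998, 1, 1), copy=80.0)
deferred = HeldItem("price", "decimal", "Shoemaker", date(1998, 1, 1),
                    fetch=lambda: 50.0)  # stands in for a query to the source
```

Keeping only `fetch` for long-waiting data mirrors the suggestion to hold a reference and go back to the source database when a match finally occurs.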
Mediation is achieved by software. A mediator transforms data available on the network to make it more suitable and relevant to the consumer. This software function can be carried out on dedicated computers on the network, or can be assigned to computers performing other tasks as well. Since software is easy to copy over the same highways used to transport the goods, many copies can be rapidly installed. The ease with which software-based factories can move implies a much more flexible configuration of the network than seen in traditional manufacturing. A traditional supplier of added value can be rapidly replaced.
Mediation adds value to the data. Mediators are created by experts and need to be maintained, so that they can remain effective in a constantly changing world. The user of a mediator should pay for such services, and its creator and maintainer should receive payments. Growth of the information highways is enabled by mediation among autonomous modules, and growth of mediation requires the availability of payment mechanisms.
Figure:
Clinic Example: Two domain-specific mediators supporting one customer.
Hence mediators must first of all reduce the volume being presented, by selecting relevant data from a variety of sources and summarizing it to bring it to the level needed for the users' application. When multiple data sources contribute data, the data must be combined or fused. Reliable fusion requires that data match in terms of level and scope. To reduce the cost of repetitive access a mediator may store its internal results for some time in a cache. The sources may be updated at differing times, and having a cache also provides a means to synchronize data, so that they will be temporally consistent. However, dealing with consistent but obsolete data will not satisfy users involved in planning tasks, which often require simulation. Simulation can extrapolate information so that it becomes current, or extends into the future, generating new, but less definite information. These five functions will now be described in detail.
Figure:
Mediator Tasks: generating information.
A mediator node will manage access and processing of a number of concepts within one domain. !Extract a hierarchy!.
Figure:
Extract a tree: Finding and structuring data from the web.
!incorporate in Ontologies?!
For each domain we define an ontology: a vocabulary, and a classification scheme which links the terms used, as shown in Fig.\vocabulary !somewhere!. When a term is used outside of its context, it is labeled: Carpentry.miter versus Religion.miter. Within a domain the terms should be wholly unambiguous, both in definition and scope. Considering the differences between employees in Personnel and on the Payroll from the example in Sect.!\?\?\?!, we need to distinguish those two domains. We show a simple example of two domains, and note that outside of the domain we should label the terms Shoesales.model, Shoesales.supplier, etc., and Shoemaker.company, Shoemaker.model, etc.
Figure:
Domains: terminologies in a shoe store, a shoe factory, and in purchasing shoes.
Keeping all terms disjoint disables the domain interoperation we seek. We have to develop a set of operations that permits us to match and merge ontologies: an algebra over ontologies. Only a few operations are needed, namely Intersection (∩), Union (∪), and Difference (-). We also need operations to help us within the domains, so that local domain terms can be mapped to meanings that are globally acceptable; we call this operation Map (M).
Intersection (∩) creates a list of those terms that match in meaning. For instance, given the two domains, we can define the intersection to match Shoesales.model with Shoemaker.model, and Shoesales.supplier with Shoemaker.company. Note that these are knowledge-based operations: they require that somewhere the knowledge exists to permit the computation of the intersection of the source ontologies. This need for knowledge establishes the reason for having mediators as distinct modules, since this knowledge has to be captured and maintained somewhere. It cannot be part of Shoesales, nor of Shoemaker alone. Within the mediator we create a new ontology for this intersection; let's call it Shoes. Then
Shoes.model = ∩(Shoesales.model, Shoemaker.model)
and, creating a new term in the Shoes ontology,
Shoes.maker = ∩(Shoesales.supplier, Shoemaker.company).
Terms such as color and tint, and orders and backlog, can also be matched.
Not all terms will be defined in the intersection. Local terms, such as Shoemaker.man-hours, will not be matched and will not appear in the intersection shown.
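A minimal Python sketch may make the knowledge-based character of the intersection concrete. It assumes that an ontology can be represented as a set of qualified term names and that the mediator's matching knowledge is a table of rules; the names `MATCH_RULES` and `intersection` are hypothetical, and real matching knowledge would be far richer than a lookup table.

```python
# Hypothetical matching knowledge held by the mediator: each rule names a
# term of the new Shoes ontology and the two source terms it matches.
MATCH_RULES = {
    "Shoes.model": ("Shoesales.model", "Shoemaker.model"),
    "Shoes.maker": ("Shoesales.supplier", "Shoemaker.company"),
}


def intersection(left, right, rules):
    """Keep each rule whose two source terms occur in the two ontologies."""
    return {new: (a, b) for new, (a, b) in rules.items()
            if a in left and b in right}


# Ontologies reduced to sets of qualified terms (an assumption of this sketch).
shoesales = {"Shoesales.model", "Shoesales.supplier", "Shoesales.price"}
shoemaker = {"Shoemaker.model", "Shoemaker.company", "Shoemaker.man-hours"}

shoes = intersection(shoesales, shoemaker, MATCH_RULES)
```

The rules live in neither source set, which reflects the point made above: the knowledge belongs to the mediator, not to Shoesales or Shoemaker alone. Local terms such as Shoemaker.man-hours have no rule and therefore do not appear in `shoes`.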
Union (∪) creates a complete list. For those terms that have a defined intersection the result is given; the remainder are copied as they are. This operation does not create a consistent new ontology, since now similar prime terms continue to have their old domain association. For example
AllShoes.model = ∪(Shoesales.model, Shoemaker.model)
AllShoes.Shoesales.price = ∪(Shoesales.price)
AllShoes.laborhours = ∪(Shoemaker.laborhours)
In practice the ontological algebra operates on sets, with results being new sets.
Difference (-) permits removal of terms. Its main use is to let us determine which terms in sets of terms are unmatched, i.e., local, as OnlyShoesales.
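Since the algebra operates on sets, union and difference can be sketched directly with Python set operations. The sketch below is illustrative only: the rule table `RULES`, the helper `matched`, and the choice to name merged terms after the rule (rather than with an AllShoes prefix) are assumptions made for brevity.

```python
# Hypothetical matching knowledge: new term -> (Shoesales term, Shoemaker term).
RULES = {
    "Shoes.model": ("Shoesales.model", "Shoemaker.model"),
    "Shoes.maker": ("Shoesales.supplier", "Shoemaker.company"),
}


def matched(left, right, rules):
    """Rules whose two source terms are both present."""
    return {new: (a, b) for new, (a, b) in rules.items()
            if a in left and b in right}


def union(left, right, rules):
    """Matched terms appear once under their new name; the rest are copied."""
    hits = matched(left, right, rules)
    covered = {term for pair in hits.values() for term in pair}
    return set(hits) | ((left | right) - covered)


def difference(left, right, rules):
    """Terms of the left ontology that have no match in the right one."""
    hits = matched(left, right, rules)
    return left - {a for a, _ in hits.values()}


shoesales = {"Shoesales.model", "Shoesales.supplier", "Shoesales.price"}
shoemaker = {"Shoemaker.model", "Shoemaker.company", "Shoemaker.laborhours"}

all_shoes = union(shoesales, shoemaker, RULES)
only_shoesales = difference(shoesales, shoemaker, RULES)
```

The copied terms keep their old domain prefix, which is why the union is not a consistent new ontology. In this sketch `difference` only handles the left operand in rule order, which is enough to compute the local terms of Shoesales.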
Map (M) permits computation within an ontology, so that more terms become candidates for matching and for the creation of useful intersections. For instance, there is a relationship between the Shoesales.price and the Shoesales.profit which requires the Shoemaker.price. We can define
Shoesales.cost = M(Shoesales.price, Shoesales.profit).
We also need mappings to determine the size and width of the shoes from the Shoemaker.last, as
Shoemaker.size = M(Shoemaker.last); Shoemaker.width = M(Shoemaker.last).
These mappings are part of the knowledge incorporated in the mediator.
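The Map operation can be sketched as a table of functions applied within one domain. Everything specific here is an assumption made for illustration: that cost is price minus profit, that a last is written as a code such as "9D" (size followed by a width letter), and the names `apply_maps`, `SHOESALES_MAPS`, and `SHOEMAKER_MAPS`.

```python
def apply_maps(record, mappings):
    """Add each derived term, computed from existing terms of the record."""
    derived = dict(record)
    for target, (function, sources) in mappings.items():
        derived[target] = function(*(record[s] for s in sources))
    return derived


# Assumed relationship: cost is what remains of the price after the profit.
SHOESALES_MAPS = {
    "Shoesales.cost": (lambda price, profit: price - profit,
                       ("Shoesales.price", "Shoesales.profit")),
}

# Assumed encoding of a last: size digits followed by one width letter.
SHOEMAKER_MAPS = {
    "Shoemaker.size": (lambda last: int(last[:-1]), ("Shoemaker.last",)),
    "Shoemaker.width": (lambda last: last[-1], ("Shoemaker.last",)),
}

sale = apply_maps({"Shoesales.price": 80.0, "Shoesales.profit": 30.0},
                  SHOESALES_MAPS)
shoe = apply_maps({"Shoemaker.last": "9D"}, SHOEMAKER_MAPS)
```

After mapping, Shoesales.cost becomes a candidate for matching with Shoemaker.price, and the derived size and width can be matched with terms a shoe store uses.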
The crucial ontology in a mediator is created through intersection of source ontologies.
Over the intersection we can compute new aggregations and enhance the information
content of the data produced through mediation. Having an algebra permits composition
over ontologies. We can expect that useful new domains can be created by taking the
union of two or more intersections.
By having an algebra on domains that permits creating explicit linkages of joint
terminology we also suppress the temptation of creating larger, but less precise ontologies.
A base ontology should not be so large as to preclude agreement, within a modest time, among the
people using it. If agreement takes too long, some terms will already have changed in meaning.
Support of change is actually a major motivation for formal management of ontologies.
Unless the current state and extent of mismatch can be well defined, we will not be able to
note the improvements being made as people, by interacting via these information
structures, achieve more consensus.
The description in this section ignores a major aspect of operating on ontologies. Early
work in that direction is represented by algebras over objects [Barsalou::xx].
A major effort to define an ontology is
MEDIATORS.Alternatives
The functions carried out in mediation are not new; they have been needed for a long time, but were
not recognized as being distinct.
Alternatives included designing large, integrated systems, often with the help of consultants, since
any single organization rarely had the breadth to deal with multiple domains. When the sources and the
applications remain distinct, but connect directly, we speak of client-server systems. When the roles
are less well-defined we have open systems.
The second, related difference is that the consultant remains responsible for the
mediator. If the former consultant, now the mediator expert, has not formulated the
mediating program right, then it is the expert's responsibility to fix it. And, since mediation
must deal with changing environments, any further changes are also the expert's
responsibility. The payment mechanism is also different. In mediation, we expect that
small fees will be extracted with every successful use. That changes the motivation for the
consultant. A quality product will generate ongoing income. In today's mode the
consultant gets a new fee when things go wrong.
Gio Wiederhold was born in Italy, during the initial actions of World War II, and spent the
remainder of the war in various European countries, while his family was trying to stay out of
trouble and danger; he managed to avoid going to school during that time. After the
situation settled down, he was shipped to Holland and attended the Grotius Lyceum in The
Hague, the Technicum in Rotterdam and the Technical University in Delft, studying
aeronautical engineering. !Lab.Elec.Music! During summers he worked selling ice-cream,
repairing refrigeration machinery, and shipping out on a Dutch merchant vessel. A
summer job in 1957, at the NATO Air Defense Technical Center in Wassenaar, Holland,
led to his introduction to computing, first on calculators and then on very primitive
computers.
In 1958 he emigrated to the United States, first working for IBM and later as Chief
Programmer for the University of California. After teaching one year at the Indian
Institute of Technology, he became Director of the Advanced Computer for Medical
Experiments (ACME) at Stanford and a lecturer in its recently established Computer
Science Department. For the Medical School he developed a time-shared real-time
data-acquisition system (ACME). It included a large on-line filing system, and with it he
established the Time-Oriented Database system (TOD) for the masses of data being
collected. TOD eventually provided nation-wide services for immunology patients.
In 1969 he set up Index corporation, developing information retrieval systems for real-
estate and performing artists.
In 1974 he enrolled in the new PhD program in Medical Information Science at the
University of California in San Francisco. His thesis, completed in 1976, was on
[structured Design of Medical Databases]. During his studies he wrote an early textbook
"Database Design" [Wiederhold:77,83] which provided a quantitative approach to the topic.
After joining Stanford's Computer Science faculty in 1976, he proposed research
combining AI and databases to ARPA, coining the term Knowledge-bases for this
combination. Research in that direction led to a number of topics presented in this work,
especially the concept of mediation discussed in this chapter. He has remained at Stanford
except for a sabbatical with IBM Germany in 1987, consulting on knowledge-base support
for LInguistic LOGic systems (LILOG), and an assignment to ARPA from
1991 to 1994. At ARPA he was Program Manager for Knowledge-based Systems, and
had the opportunity to interact with colleagues involved in the HPCC program and the NII.
Gio Wiederhold has been recognized by a number of communities. He is a fellow of
the ACMI, the IEEE, and the ACM, and has received the ACM
SIGMOD Contributions Award.
He was a member of the board of the NSF-established National Center for Geographic
Information and Analysis (NCGIA), and much of the material presented in Chapter \P was
obtained in their meetings. He has consulted for local companies, for multinational
enterprises, for government, and for international organizations, such as the United Nations
on information systems for India and China.
A mediator often creates a derived work from material protected by
copyright.
This means that the supplier of source data, if such source data
was copyrighted, must be reimbursed for every use. Such a reimbursement must come
from the owner of the mediator. If use of the mediator is charged, then a portion of the
charges can be allocated to the provider of the source material; otherwise such charges
must come from the owner's budget.
Figure:
Domains: Operations in an ontology algebra.
MEDIATORS.Alternatives.consultants
Mediator modules encode the knowledge that is commonly
provided by a consultant, but there are two crucial differences. The representation of the
knowledge in executable form is not provided by the consultant. Often the application
programmer integrates what was learned into the application programs. When these
programs actually enter use the consultant may be long gone, or collects another fee for
fixing any misunderstandings. In a mediator the provider of the knowledge also provides
its representation. That is in concert with
Figure: Consultants: taking an active role.
MEDIATORS.Alternatives.source-integration
MEDIATORS.Alternatives.client-server
MEDIATORS.Alternatives.open-systems
MEDIATORS.Bio
Gio Wiederhold: It is immodest to put one's own biography into a book.
Normally the author provides to the publisher a biography for the dustcover, so that its role
is to keep the book clean. But the topics covered in this book will undergo rapid change,
so that keeping it clean should not be a worry.
MEDIATORS.Conclusion
Income for mediating services !...!. Per-use fee, lease, purchase.
Maintenance and update costs.
MEDIATORS.Conclusion.science
The concept of mediation combines elements from at least three disciplines: Databases,
System analysis, and Artificial Intelligence. There is not yet a specific discipline which
deals with the delivery of information at a high and integrated level. Concepts such as mediation are intended
to form a basis for a new discipline: Integration Science.
Figure: Integration Science: sources of the discipline.
MEDIATORS.Lists
Companies supplying Mediators
This information is based on a survey for Data Base Processing and
Design, 1998.
name | location | product | services | [ref]
Epistemics | Palo Alto CA | Infomaster | scheduling, resource management | [www.epistemics.com]
Global Infotek | Vienna VA | systems engineering | integration | [www.globalinfotek.com]
iBrain Inc | Palo Alto CA | decision support software | financial services | [www.ibrain.com]
I-Kinetics Inc | Cambridge MA | Databroker | infrastructure software | [www.i-kinetics.com]
ISI | Marina del Rey CA | SIMS | mediator research and development | [www.isi.com]
ISX | Westlake Village CA | design and implementation | intelligence systems, planning and logistics | [www.isx.com]
Junglee | Sunnyvale CA | Junglee | Internet job placement, shopping | [www.junglee.com]
K2 Informatics | Bryn Mawr PA | Kleisli | genomic information | [71072.234@compuserve.com]
Lockheed-Martin Idaho Technologies | Idaho Falls | ERIS | environmental, chemical data | [http://www.ineel.gov]
Lockheed Martin Management & Data Systems | King of Prussia PA | system design | government, education | [www.lmco.com]
MCC consortium | Austin TX | Infosleuth | technology development for consortium members | [www.mcc.com]
GeneLogic Bioinformatics | Berkeley CA | OPM | object-based integration of genomic data | [www.genelogic.com]
Netbot | | | Internet shopping | [www.excite.com?]
Persistence Corporation | San Mateo CA | Persistence | object creation | [www.persistence.com]
Socratix | Palo Alto CA | | genomics, drug development | [
SST | Woodside CA | Passgate | privacy protection in collaboration | [www.2ST.com]
Tessarae | San Jose CA | design and implementation | integrating heterogeneous information systems | [tesserae.com]
Fin
Notes
!Godown == gudang in malay!