A goal in information systems interoperability research is tight coupling: to be able to construct from a collection of databases a system which appears to the user as a single database, seamlessly integrating the component databases. The critical problem in achieving interoperability is semantic heterogeneity. This paper argues from the perspective of text databases that the conditions under which semantic heterogeneity can be overcome are very special. Further, the operational conditions under which tight coupling can be achieved are also very special. As a consequence, the goal of tight coupling is often not possible, and sometimes not even desirable. However, a limited tight coupling can be achieved by following the exchanges of information among organizations in their normal course of business, although the resulting network of limited tight couplings does not necessarily result in a useful global schema.
In today's networked world it seems necessary to have some way of establishing how high the cost of gaining information about a specific subject may be, and to be able to weigh whether the 'worth' of the knowledge relative to its 'cost' is high enough for it to be taken into account and acquired. For this I propose that it is necessary to have some way to calculate this relation and to keep the result on hand for future knowledge acquisition decisions. It is unnecessary to keep all knowledge on hand locally unless a third weight, which should also be calculated, the 'frequency' of needed access to the information, is sufficiently high in comparison to the worth-versus-cost relation to make local storage necessary. Further weights which should be taken into account are the persistence of the information in the distributed sources and its rate of change. The process of acquiring all the relevant weights is part of the process of knowledge abstraction.
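As a rough illustration of the weighting scheme described above, the following sketch decides whether a piece of remote knowledge should be acquired at all and whether it should be cached locally, from its 'worth', 'cost', 'frequency', and rate-of-change weights. All names, formulas, and thresholds here are illustrative assumptions, not part of the proposal itself:

```python
# Hypothetical sketch of the worth/cost/frequency weighting described
# above. The decision formulas and the threshold value are invented
# assumptions for illustration.
from dataclasses import dataclass

@dataclass
class KnowledgeSource:
    worth: float       # estimated value of the knowledge
    cost: float        # cost of acquiring it over the network
    frequency: float   # expected accesses per unit time
    volatility: float  # rate of change of the information (0 = static)

def should_acquire(src: KnowledgeSource, threshold: float = 1.0) -> bool:
    """Acquire only if worth sufficiently outweighs cost."""
    return src.worth / src.cost >= threshold

def should_cache_locally(src: KnowledgeSource) -> bool:
    """Cache only if frequent access amortises the acquisition cost and
    the information is persistent enough not to go stale quickly."""
    return should_acquire(src) and src.frequency * src.cost > src.volatility

rare = KnowledgeSource(worth=2.0, cost=1.0, frequency=0.01, volatility=0.5)
hot = KnowledgeSource(worth=2.0, cost=1.0, frequency=10.0, volatility=0.5)
```

Under this toy rule, `hot` is worth caching while `rare` is acquired on demand only.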
We are considering doing a paper relating the use of I3 in a
project called Dual-use Technology Insertion Decision Support System
(DTIDS), a project we are doing for the Air Force related to the
125,000-vehicle Air Force ground fleet, the move to transition that
fleet as quickly as possible to alternate-fueled vehicle (AFV)
scenarios, and the complexity of managing the information for
acquisition, technology insertion, and system performance measures, to
name a few. The AFV System Program Manager, Carl Perazzola of Warner Robins
AFB, is very tied into ARPA projects in support of his interests, and
is using our Program Management skills and technologies to extend his
own management capability. I have seen a natural fit for I3, and have
asked Sham Navathe and his people to tune their work in support of
this effort. I have also apprised Dave Gunning of the situation, and
we are considering an I3 Workshop for this Program at Warner Robins in
the near future.
There's just a little feedback for you on what I consider to be a great
transition of technology from I3 to a very large Air Force need.
Beyond ground vehicles, my Section, which is primarily interested in
Embedded Aircraft Applications and their support environments, is
looking for applications of I3 in these areas as well. I know from
its inception that I3 addresses issues related to the F-22. My
organization, at the Directorate Level, Wright Laboratory Avionics
Directorate in particular under Dr. Jesse Ryles and deputy Colonel
Lewantowicz, is very interested in Technology Fusion. All of the
programs under the Directorate are being tasked with looking at key
technologies that are Fusion in nature.
We believe that the I3 technologies are definitely key Fusion
Technologies. We request your support in building this case, and
potentially moving I3 more aggressively into avionics applications.
That starts with what you are so good at, dialogue and suggestions for
collaborations.
The paper presents an approach to resolving semantic heterogeneity in multidatabases which relies upon a knowledge-based ontology. The ontology is built around a core of about 250 concepts to which domain-specific layers are added as data sources about those domains are required. The terms in the ontology are used to express queries in a data-source-independent, first order logic representation language with constrained second order extensions. Data sources are also described within this representation language. Queries are expanded and normalised on the basis of user and task models before both simple and heuristic matching is performed between queries and data source descriptions in order to identify target data sources and local identifiers for the query variables. Following this conceptual interpretation, queries for each target data source are constructed at a logical level, where processing constraints (time and cost of the retrieval itself) are applied to optimise performance. After queries are dispatched and data returned, remaining conflicts are resolved, data are normalised, and they are integrated into HyTime templates which grow a hypermedia web for the presentation of retrieved multimedia data to the user. The architecture of the Multimedia Information Presentation System (MIPS) which has implemented this approach is described, along with discussions of the relative complexity of both the implemented system and the supporting methods to use and maintain it. Measures of performance show the relative runtime cost of each part of the algorithm. In particular, the tradeoffs made between end-user and system-support effort, and between run-time and off-line effort, are debated for such a general and therefore evolving system.
With multiple heterogeneous information sources accessible in modern networked environments, users and applications cannot be expected to keep up with all their specific languages, organizations, contents, and network locations. Yet the need for the data available from these sources remains. Users and applications would benefit greatly from the ability to formulate queries in a language that is free from any reference to specific information sources. SIMS is a system that supports such querying. This paper describes how SIMS can reformulate a high-level query that describes only the desired information into a query that makes explicit the information sources that need to be accessed and the data that need to be retrieved from each. To perform this task, SIMS uses models (declarative descriptions of the application domain and the available information sources) and reformulation operators that successively rewrite portions of the query until all needed information sources are made explicit. This approach provides a flexible and extensible system for integrating heterogeneous information sources. We have demonstrated the feasibility and effectiveness of this approach by applying SIMS in the domains of transportation planning and medical trauma care.
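The reformulation idea above can be caricatured as a rewrite loop; the rules, predicates, and source names below are invented for illustration and are not SIMS's actual operators:

```python
# Toy illustration of source-independent query reformulation: rewrite
# rules replace domain-level predicates with source-specific ones until
# every predicate names a concrete source. All predicates, rules, and
# source names here are invented assumptions.

# Each rule maps a domain predicate to a list of source-level predicates.
REWRITE_RULES = {
    "flight(origin, dest)": ["db1.routes(origin, dest)"],
    "capacity(dest)": ["db2.airports(dest, runway_len)",
                       "db2.limits(runway_len, capacity)"],
}

def reformulate(query: list) -> list:
    """Successively rewrite predicates until all sources are explicit."""
    result = []
    for pred in query:
        if pred in REWRITE_RULES:
            # recurse in case a rewrite introduces further domain predicates
            result.extend(reformulate(REWRITE_RULES[pred]))
        else:
            result.append(pred)  # already source-specific
    return result

plan = reformulate(["flight(origin, dest)", "capacity(dest)"])
```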
A new generation of information systems that integrates knowledge base
technology with database systems is presented for providing
cooperative (approximate, conceptual, and associative) query
answering. Based on the database schema and application
characteristics, data are organized into Type Abstraction Hierarchies
(TAHs). The higher levels of the hierarchy provide a more abstract
data representation than the lower levels. Generalization (moving up
in the hierarchy), specialization (moving down the hierarchy), and
association (moving between hierarchies) are the three key operations
in deriving cooperative query answers. Based on the context, the TAHs
can be constructed automatically from databases. An intelligent
dictionary/directory in the system lists the location and
characteristics (such as context and user type) of the TAHs, allowing
the system and user to select the appropriate one for relaxation. A
knowledge editor is also provided to browse and edit the TAHs as well
as the relaxation parameters. CoBase also provides a relaxation
manager to control query relaxation, and an explanation
system to describe the relaxation and association
processes and to report the quality of the relaxed answers.
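As a rough sketch of how generalization and specialization over a TAH support relaxation, consider the following toy hierarchy. The values and the hierarchy itself are invented for illustration; CoBase derives its TAHs from the database:

```python
# Minimal Type Abstraction Hierarchy sketch: generalization moves a
# query value up to a more abstract node, specialization enumerates the
# more specific values under it. The airport hierarchy is invented.

# child -> parent edges of one TAH
PARENT = {
    "SFO": "medium_airport", "OAK": "medium_airport",
    "LAX": "large_airport",
    "medium_airport": "airport", "large_airport": "airport",
}

def generalize(value: str) -> str:
    """Move one level up the hierarchy (query relaxation)."""
    return PARENT.get(value, value)

def specialize(value: str) -> list:
    """Enumerate the direct children of an abstract value."""
    return sorted(c for c, p in PARENT.items() if p == value)

# Relaxing an over-specific query: if no data exist for SFO,
# ask about all medium airports instead.
relaxed = specialize(generalize("SFO"))
```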
CoBase uses a mediator architecture to provide scalability and
extensibility. Each cooperative module, such as relaxation,
association, explanation, and TAH manager, is implemented as a
mediator. Further, an intelligent directory mediator is provided
to direct mediator requests to the appropriate service mediators.
CoBase mediators have uniform interface specifications and are
interconnectable to perform joint tasks. Mediators communicate
with each other via KQML. A CoBase ontology for KQML is
presented in the paper.
A Geographical Information System is developed on top of CoBase.
Queries can be specified graphically and incrementally on maps,
greatly improving querying capabilities. Further, the relaxation
process can be visualized graphically on the map. For ease in
technology transfer, CoBase was implemented in C++ and uses CLIPS for
knowledge representation. CoBase has been demonstrated for answering
imprecise queries for transportation applications and for matching
medical image (X-ray, MRI) features. It has also been used for schema
integration of heterogeneous databases, and for matching radar emitter
signals to locate platforms. Measurements as well as metrics
regarding scalability and extensibility will also be presented.
Vertical information management (VIM) supports decision makers
working within various levels of a management hierarchy, seeking
information from potentially large, distributed, heterogeneous, and
federated information sources. Decision makers are often overwhelmed
by the volume of data which may be relevant and collectible, but
overly detailed (e.g., from the breadth of open source data).
Yet, the collected information must maintain its pedigree to allow
access to detail on demand. VIM structures a top-down query refinement
and bottom-up information collection process. Our approach explicitly
represents the abstract solution which is used to generate a
representation-dependent solution to the information request. One
assumption of this work is that high-level information requests may
involve data that is extracted or derived from underlying information
sources, as well as data that is not present in the underlying
information sources (referred to as "gaps"). For a high-level
information request to be issued, a more detailed specification using
the representation-dependent components of the framework must
be utilized.
The VIM framework has been developed in the context of the ARPA I3
program in that we provide:
i) support for access to independent, heterogeneous information
sources without the use of a complete global schema [1, 7],
ii) query formulation services [7],
iii) a recognized manual element for integration of underlying
data [2, 6], and iv) operation in the context of incomplete
or missing data [5]. However, the scope of our work is more narrow
than the scope of the ARPA I3 program; we focus on supporting
read-only access to the underlying sources, and on capturing the
process used to derive the high-level information, as opposed to
focusing on the automated delivery of information.
We propose to use a collection of mediators
from the I3 program [3, 6] and from ongoing work on context
interchange [4] to accomplish the semantic and representational
resolution between the data of interest, and the data contained in the
particular underlying data sources. Contributions of
the VIM framework include:
o separation of semantics from representation; a high-level
information request, and the way it is to be constructed from
base data, may be specified without the burden of the
representational detail for the actual underlying data, and
without being limited to the data directly stored in the
underlying information sources,
o reusability of high-level information requests against different
underlying data sources,
o provision of an elegant interface between the data integration
problem and the information derivation problem; this makes the
larger problem of providing decision makers with useful
information more understandable,
o composability, allowing existing specifications to be pieced
together for increasingly complex requests for information, and
o derivation of defensible data.
This work is motivated by the demands for arbitrary high-level
information about environmental restoration and remediation on a
regional and national scale. This paper will describe the VIM
framework with a focus on specification of the steps for derivation
of the high-level information from base data in terms
of both the abstract, representation-independent and the
representation-based components. We will also describe the prototype
for specification and execution of high-level information requests.
References
[1] Chawathe, S., Garcia-Molina, H., Hammer, J., Ireland, K.,
Papakonstantinou, Y., Ullman, J., and Widom, J. (1994).
"The TSIMMIS Project: Integration of Heterogeneous Information
Sources" in Proceedings of the 100th IPSJ Anniversary Meeting,
Tokyo, Japan.
[2] Papakonstantinou, Y., Garcia-Molina, H., and Widom, J. (1995).
"Object Exchange Across Heterogeneous Information Sources" to
appear in ICDE 95.
[3] Papakonstantinou, Y., Garcia-Molina, H., and Ullman, J. (1995).
"MedMaker: A Mediation System Based on Declarative Specifications
(Extended Version)" submitted for publication 1995, available at
http://www-db.stanford.edu/~yannis/yannis-papers.html.
[4] Sciore, E., Siegel, M., and Rosenthal, A. (1994). "Using Semantic
Values to Facilitate Interoperability Among Heterogeneous
Information Systems", ACM Transactions on Database Systems
vol. 19-2.
[5] Quass, D., Rajaraman, A., Sagiv, Y., Ullman, J., and Widom, J.
(1994). "Querying Semistructured Heterogeneous Information"
unpublished memorandum, CSD, Stanford U,
anonymous ftp as pub/quass/1994/querying-submit.ps.
[6] Wiederhold, G. (1992). "Mediators in the Architecture of Future
Information Systems" IEEE Computer vol. 25-3, pp. 38-49.
[7] "Reference Architecture" a draft developed in the
November '94 ARPA Intelligent Integration of Information
Workshop, available at
http://www.cs.colorado.edu/~dbgroup/i3-ref-arch.html.
The need for mappings between different data representations is widespread in the domain of Computer-Aided Design (CAD). Mapping is required to maintain compatibility between different versions of an evolving schema as well as for more complex problems involving semantic heterogeneity, such as support for multiple data views and tool interoperability. Currently, no well-established methodology exists to guide the construction of mappings between incompatible representations. This paper will describe the Tool Integration Package (TIP) currently under development at the University of Manchester. TIP is built upon a generic procedural interface (GPIC) that provides highly flexible database access as well as the means for meta-level facilities. The construction of mappings in TIP is based on a set of generic operators that a user can apply to provide structural constraints on the mapping functions. A number of modules will also be described that provide automated support for mapping construction, including name and structure analysis of the schemas being mapped between. The use of TIP ultimately leads to the encoding of a set of application-specific mapping functions. Theoretical issues concerning the run-time management of these mapping functions will also be discussed. The paper will be illustrated by a number of practical examples, including a tool interoperability application carried out in collaboration with a major CAD vendor.
The high complexity and inherent heterogeneity of real-world problems remain among the major challenges for advanced information processing systems. Owing to the necessity of using different problem-solving techniques, hybrid systems have become a fast-growing research area, as the many recently published approaches show. To support the integration of intercommunicating hybrids, this paper suggests the use of distributed AI (DAI) techniques. The main advantages of this approach are the encapsulation of different paradigms, the separation of control and domain knowledge, and the reduction of the complexity of individual problem solvers. After a discussion of the technologies used in our application (knowledge-based systems, neural networks), we conclude Chapter 2 by reviewing related work. Because of the special importance of DAI for our argument, we examine in Chapter 3 issues and research directions in this field and conclude that chapter with the presentation of a view of DAI as an integrative paradigm. In a case study we show how neural networks can be integrated into a more general problem-solving framework. In particular, architectural aspects are discussed.
While the Internet provides an extensive and rich collection of geospatial resources (i.e., geographic information and services), the ability to find and retrieve relevant information effectively is not currently provided. Specific challenges in this domain include: a large number of disparate collections of data and service-based resources without a cohesive method of describing and classifying them; a need for spatial reasoning and analysis for generation of desired products, which may require a specific access protocol to multiple resources and post-processing within a Geographic Information System (GIS); a need for acquiring a geospatial product without the need for a local GIS; difficulty in finding the appropriate geospatial resource(s) for a given request; and a need for an application-level geospatial retrieval product in place of a standard browser application. We are constructing a prototype architecture for the Geospatial Resource Broker that addresses some of these challenges. This architecture includes: a resource directory service capable of capturing meta-information about services, pointers to meta-data where available, and meta-knowledge about resource content; a mediator that captures and maintains resource descriptions; a collection of cooperating agents, including accessors that know the nuances of resource structures and retrieval mechanisms, and GIS product generators that can import, manipulate, and export spatial data; and facilitators that provide intermediate services to handle a spatial request by orchestrating resource directory access and agent invocation. An initial prototype will demonstrate the ability to provide information about spatial resources and retrieve information through this intelligent broker service. It will also clarify the types of meta-data, information, and knowledge needed to support the geospatial domain.
For achieving interoperability among heterogeneous computing systems, the Object Management Group (OMG) proposed the use of an Interface Definition Language (IDL) for specifying object properties and operations which encapsulate the data and programs of heterogeneous systems. Although IDL is suitable for achieving program interoperability, its underlying object model is that of C++ and lacks the semantic expressiveness needed for capturing the complex structural properties and constraints found in many application data. For achieving product model and data exchange, the ISO/STEP community has introduced the information modeling language EXPRESS. In comparison with IDL, EXPRESS is richer in semantics. It allows the definition of more complex data types and their constraints. However, EXPRESS, though having an object-oriented flavor, is not an OO language. It does not capture the behavioral properties of data and is still semantically weak in expressing many association types and constraints of complex objects. This paper describes a common language which integrates the features of IDL and EXPRESS as well as the features of the association type and knowledge rule specifications offered by the Object-oriented Semantic Association Model (OSAM*). This common language, named the NIIIP Common Language (NCL), is a part of the R&D efforts of the project called the National Industrial Information Infrastructure Protocols (NIIIP). NCL is to be used for modeling all things of interest in a heterogeneous network system in terms of their 1) structural properties (attributes and association types), 2) behavioral properties (method specifications), and 3) semantic constraints (event-condition-action rules).
Its design conforms as much as possible to the two standard languages IDL and EXPRESS, and its implementation is based on a language mapping to an extensible programming language K.3 whose processing is supported by an extensible object-oriented knowledge base management system, OSAM*.KBMS. In NCL, frequently used simple constraints are specified by keywords, association types among object classes are specified in association specifications, behavioral properties of objects are defined in method specifications, and complex semantic constraints of various types are specified by ECA rules. Keyword and association type specifications are translated into ECA rules for processing by the rule processor of the KBMS. Additional semantic properties found in a heterogeneous environment can be easily introduced into NCL due to the extensibility feature of K.3 and OSAM*.KBMS. In this paper, we shall show how such an enriched object model and its language can be used to extend the OMG's ORB functionalities for information sharing and program interoperability in a heterogeneous environment.
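The event-condition-action rule processing described above can be sketched roughly as follows; the rule content (a clamp-to-zero constraint) is invented for illustration, and this is not the KBMS's actual rule processor:

```python
# Bare-bones sketch of event-condition-action (ECA) rule processing:
# each rule fires on an event, checks a condition over the affected
# object, and runs an action. Rule content is an invented example.
from typing import Callable

class ECARule:
    def __init__(self, event: str,
                 condition: Callable[[dict], bool],
                 action: Callable[[dict], None]):
        self.event, self.condition, self.action = event, condition, action

class RuleProcessor:
    def __init__(self):
        self.rules = []

    def register(self, rule: ECARule) -> None:
        self.rules.append(rule)

    def signal(self, event: str, obj: dict) -> None:
        """Fire every rule whose event matches and whose condition holds."""
        for rule in self.rules:
            if rule.event == event and rule.condition(obj):
                rule.action(obj)

# A simple keyword-style constraint compiled down to an ECA rule:
# on update, clamp a negative quantity to zero.
proc = RuleProcessor()
proc.register(ECARule(
    event="update",
    condition=lambda o: o["quantity"] < 0,
    action=lambda o: o.update(quantity=0),
))
part = {"name": "bolt", "quantity": -5}
proc.signal("update", part)
```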
In our report, we will discuss the problem solving architecture of
Neper* Wheat, an expert system developed for the management of
irrigated wheat in Egypt. Neper Wheat combines the products of expert
systems research and crop modeling into a problem solving architecture
that addresses the various aspects of wheat crop management. This
includes varietal selection, planting/harvest date selection, sowing
parameters decisions, insect/disease/weed identification and
remediation, irrigation/fertilization management and harvest
management. Specifically, from the field of expert systems research,
we adopt the Generic Task Approach to expert systems development
pioneered by Chandrasekaran et al. (1986), as well as the Knowledge
Level Architecture ideas proposed by Sticklen (1989). From the domain
of wheat crop modeling, we utilize CERES Wheat (Ritchie et al. 1985)
as a dynamic knowledge-base to provide predictions of the crop's
behavior. The Generic Task (GT) Approach allows developers of
knowledge-based systems to tackle complex problems through a method of
task decomposition and mapping of appropriate problem solving tools to
each identified subtask. The Knowledge Level Architecture (KLA)
provides a means of bringing together the problem solving tools into
one integrated architecture. The KLA proposes that a problem solver
assigned to a task be viewed as a cooperating agent. Communication
channels are then established between any two agents working together.
These communication channels define requests for service and
appropriate responses between the two cooperating agents. Normally,
the cooperating agents come from the Generic Task tool set of
identified task types. However, the problem of wheat crop management
requires precise predictions of the crop's behavior given the farmer's
circumstances and crop management decisions imposed by the developing
plan. These quantitative predictions are best performed by proven crop
simulation technology. Therefore, our system employs CERES Wheat, a
well established wheat crop model, to perform this predictive task.
* The term Neper comes from an early Egyptian god of agriculture.
Chandrasekaran, B. (1986). Generic Tasks in Knowledge-Based Reasoning:
High-Level Building Blocks for Expert System Design. IEEE Expert,
1(3), 23-30.
Ritchie, J. T., Godwin, D. C., & Otter-Nacke, S. (Ed.). (1985). CERES
Wheat. A Simulation Model of Wheat Growth and Development. College
Station, Texas: Texas A&M University Press.
Sticklen, J. (1989). Problem Solving Architectures at the Knowledge
Level. Journal of Experimental and Theoretical Artificial
Intelligence, 1, 1-52.
Several methodologies for the semantic integration of databases have been
proposed in the literature. These often use a variant of the
Entity-Attribute-Relationship model as the common data model. To aid the
schema conforming and merging phases of the semantic integration process
various transformations have been defined which map between EAR
representations which are in some sense equivalent.
Our work aims to formalise previous approaches by
- adopting a semantically minimal common data model,
- formally defining the notion of a valid transformation of one schema
into another, and
- identifying a minimal set of such transformations.
The common data model we use is a binary relational one comprising
entity types and binary relationships between them, including
inclusion relationships. By minimal we mean that any valid
transformation can be defined as a sequence of transformations from
the minimal set.
We differentiate between transformations which are general to all
extensions of the schema and those which require knowledge-based
reasoning since they apply only for certain extensions. This
distinction serves to enhance the performance of transformation tools
since it identifies which transformations must be verified by
inspection of the schema extension. It also serves to identify when
intelligent reasoning is required during the schema integration
process.
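The idea of a binary relational schema, and of any valid transformation being a sequence of primitive ones, can be sketched as follows. The primitive shown (entity renaming) and the representation are illustrative assumptions, not the paper's minimal transformation set:

```python
# Illustrative sketch of a binary relational common data model and the
# composition of schema transformations from primitives. The chosen
# primitive (entity renaming) is an invented example.

# A schema: entity types plus binary relationships (name, source, target).
schema = {
    "entities": {"Person", "Dept"},
    "relationships": {("works_in", "Person", "Dept")},
}

def rename_entity(s: dict, old: str, new: str) -> dict:
    """A primitive transformation: rename an entity type everywhere."""
    def sub(e):
        return new if e == old else e
    return {
        "entities": {sub(e) for e in s["entities"]},
        "relationships": {(n, sub(a), sub(b))
                          for n, a, b in s["relationships"]},
    }

def compose(*steps):
    """A valid transformation expressed as a sequence of primitives."""
    def run(s):
        for step in steps:
            s = step(s)
        return s
    return run

# Conforming one schema toward another before merging.
conform = compose(
    lambda s: rename_entity(s, "Person", "Employee"),
    lambda s: rename_entity(s, "Dept", "Department"),
)
conformed = conform(schema)
```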
A large-scale interoperable database system operating in a dynamic environment should provide a uniform access interface to its components, scalability to larger networks, evolution of database schemas and applications, flexible composability of client and server components, and preservation of component autonomy. To address the research issues presented by such systems, we introduce the Distributed Interoperable Object Model (DIOM). DIOM's main features include the explicit representation of and access to semantics in data sources through the DIOM base interfaces, the use of interface abstraction mechanisms (such as specialization, generalization, aggregation, and import) to support incremental design and construction of compound interoperation interfaces, the deferment of conflict resolution to query submission time instead of schema integration time, and a clean interface between distributed interoperable objects that supports the independent evolution and management of such objects. To make DIOM concrete, we outline the Diorama architecture, which includes important auxiliary services such as domain-specific library functions, object linking databases, and query decomposition and packaging strategies. Several practical examples and application scenarios illustrate the usefulness of DIOM.
In the last decade, several knowledge representation formalisms and reasoning techniques based on classes and relations have been investigated. This paper deals with the idea of exploiting such techniques in the integration of database schemas.
Traditional approaches to database integration require that a common key exist in all participating relations that model equivalent real-world entities, thereby compromising the logical heterogeneity of multidatabases. Recently, a few researchers have proposed using knowledge to identify equivalent entities without requiring a common key. This raises the issue of detecting potential inconsistency between data and knowledge in entity identification. We present three criteria for consistency in this context and consider incremental testing in the process of updating data and knowledge. High efficiency is obtained for updates of data, and reasonable efficiency for updates of knowledge. To see how practically useful the proposed framework and algorithms are, we conduct an experiment on a case study of three real-life databases. In this work, all local schemas are assumed to be translated into the relational model, but they are not required to share a common key.
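Key-less entity identification via knowledge can be caricatured as follows; the relations and the matching rule are invented for illustration and are not the paper's actual criteria:

```python
# Toy illustration of entity identification without a common key: a
# knowledge rule (here, a hand-written matching predicate) decides when
# tuples from two relations denote the same real-world entity. The
# relations and the rule are invented examples.

db1 = [{"name": "J. Smith", "dob": "1960-04-01"}]
db2 = [{"fullname": "John Smith", "birth": "1960-04-01"},
       {"fullname": "Jane Doe", "birth": "1971-09-12"}]

def same_entity(t1: dict, t2: dict) -> bool:
    """Knowledge rule: same birth date and matching surname."""
    return (t1["dob"] == t2["birth"]
            and t1["name"].split()[-1] == t2["fullname"].split()[-1])

matches = [(a, b) for a in db1 for b in db2 if same_entity(a, b)]
```

Consistency checking then amounts to verifying that such rules never link one tuple to two distinct entities in the same relation.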
The integration and management of multiple data and knowledge systems entail not only interoperability, local autonomy, and concurrent processing, but also new capabilities of adaptiveness allowing scalable (or incremental) integration of non-standard or legacy systems, while avoiding the well-known performance problems caused by traditional approaches to global control and management. The past decade has seen many good solutions developed from vast worldwide efforts for many aspects of the problem; yet certain needs remain unsatisfied, especially the new capabilities of adaptiveness. Most of the previous efforts have focused either on providing an architecture for direct interoperation among different systems (such as CORBA, the Common Object Request Broker Architecture), or on developing a global model to manage these systems in the tradition of databases (such as heterogeneous and distributed DBMSs). It seems that a promising approach to the problem is making the interoperable architecture adaptive and scalable by defining it through the global model; or, simply, the model-based architecture. Such an architecture is proposed in this paper, using the metadatabase model. The concept of metadata independence for multiple systems is presented as the basis for the conceptual design of the architecture, while the Rule-Oriented Programming Environment (ROPE) method is developed to execute the design. ROPE implements global (integration) knowledge in localized rule-oriented shells which constitute the control backbone of the interoperating architecture, and manages these shells. The metadatabase enables the shells to grow in number as well as to change their control knowledge contents. A prototype is developed to test the concepts and design.
The next decade of research in high performance computing and communications promises to deliver widely available access to unprecedented amounts of constantly expanding data. It is clear that many defense and commercial applications will benefit from learning new knowledge by integrating and analyzing very large amounts of widely distributed data to uncover and report upon subtle relationships and patterns of events that are not immediately discernible by direct human inspection. Although much progress has been made in developing new and useful machine learning algorithms that learn from examples, the computational complexity of many of these algorithms makes their use infeasible when applied to large amounts of inherently and physically distributed data. In order to provide the promised ability to learn new knowledge from large amounts of information, a central problem, which we call the scaling problem for machine learning, needs considerable attention. In this paper, we describe a general approach that we have come to call meta-learning. Meta-learning refers to a general strategy that seeks to learn how to combine a number of separate learning processes in an intelligent fashion. We desire a meta-learning architecture that exhibits two key behaviors. First, the meta-learning strategy must produce an accurate final classification system. This means that a meta-learning architecture must produce a final outcome that is at least as accurate as a conventional learning algorithm applied to all available data. Second, it must be fast relative to an individual sequential learning algorithm applied to massive databases of examples, and operate in a reasonable amount of time. To achieve scalable learning systems by meta-learning that are efficient in both space and time, we study solutions based upon parallel and distributed computing. Experimental results achieved on a large number of alternative meta-learning strategies are reported.
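A minimal sketch of the combining idea, assuming a toy one-dimensional base learner and a fixed majority-vote combiner (real meta-learning strategies also learn the combiner itself):

```python
# Toy meta-learning sketch: base classifiers are trained on separate
# partitions of the data (as if physically distributed), and a fixed
# majority vote over their predictions gives the final classification.
# The threshold base learner and the data are invented examples.
from collections import Counter
from statistics import mean

def train_base(examples):
    """Toy base learner on (x, label) pairs: threshold halfway between
    the class means, labelling above-threshold points 1."""
    pos = [x for x, y in examples if y == 1]
    neg = [x for x, y in examples if y == 0]
    threshold = (mean(pos) + mean(neg)) / 2
    return lambda x: 1 if x >= threshold else 0

def meta_combine(classifiers):
    """Combine independently trained classifiers by majority vote."""
    def predict(x):
        votes = Counter(c(x) for c in classifiers)
        return votes.most_common(1)[0][0]
    return predict

# One partition per site; base learners could be trained in parallel.
partitions = [
    [(0.1, 0), (0.2, 0), (0.8, 1), (0.9, 1)],
    [(0.0, 0), (0.3, 0), (0.7, 1), (1.0, 1)],
    [(0.2, 0), (0.4, 0), (0.6, 1), (0.9, 1)],
]
final = meta_combine([train_base(p) for p in partitions])
```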
Analysis of scientific data gathered by remote-sensing instruments should lead to the evaluation and validation of scientific hypotheses concerning natural and/or human-caused phenomena. Hypotheses may further exist in contention with other hypotheses (Open World Assumption) and must be based on sufficient grounds of knowledge. These grounds can be provided by different resources, i.e., a database for measurement and observation data, a knowledge base in the form of metadata, and a spatio-temporal data model for visualization purposes. New data can strengthen or weaken a scientific hypothesis. Mediators will be used for extracting and justifying scientific hypotheses by (dis)connecting data elements from the relevant data resources under consideration. The development of the scientific information system is based upon a federated architecture.
Firstly, we describe the problem space of heterogeneity across text-based, data-based, and knowledge-based information systems, including the role of object-oriented systems within that space. This description serves as the basis for defining interoperability requirements. Secondly, we describe a collection of tools developed at Cardiff, which provide services to overcome syntactic and semantic heterogeneity in network-, relational-, and object-oriented databases and which achieve interoperability between several paradigms. The achievements so far are set against the background of the requirements described before. Thirdly, we describe an architecture which integrates these services into a standard software toolkit. Fourthly, we outline a client-server architecture to show how the services could be used in an agent- or mediator-based environment where autonomous problem-solving capabilities may be required from the toolkit.
The Context Interchange strategy has been proposed as an approach for
achieving interoperability among heterogeneous and autonomous data sources
and receivers (Siegel and Madnick, 1991). In a recent paper (Goh, Madnick
and Siegel, 1994), we have argued that the Context Interchange strategy has
many advantages over traditional loose- and tight-coupling approaches
proposed in the integration literature. Our goal in this paper is to present
an underlying theory describing how those features can be realized. For this
purpose we define a data model, called COIN (COntext INterchange), which
describes (1) how domain and context specific knowledge can be represented
and organized for maximal sharing; and (2) how these bodies of knowledge can
be used to facilitate the detection and resolution of semantic conflicts
which may arise when data is exchanged between different systems. Within
this framework, ontologies exist as conceptualizations of particular domains,
and contexts as "elaborations" (or constraints) on existing descriptions of
objects. We show that when suitably constrained, these descriptions have an
elegant logical interpretation which allows knowledge originating from
ontologies, contexts and user queries to be uniformly used for the detection
of semantic conflicts. We conclude this paper by reporting on our experiences
with an implementation which provides integrated access to multiple financial
data services using the context interchange approach (Daruwala et al., 1995)
and describe how this has provided valuable insights on viable strategies for
transitioning this technology to production use.
References:
Adil Daruwala, Cheng Hian Goh, Scott Hofmeister, Karim Hussein,
Stuart Madnick and Michael Siegel. The Context Interchange Network.
To be presented at IFIP WG2.6 Sixth Working Conference on Database
Semantics (DS-6), Atlanta, Georgia, May 30 to June 2, 1995.
Cheng Hian Goh, Stuart Madnick, and Michael Siegel. Context
Interchange: Overcoming the Challenges of Large-Scale Interoperable
Database Systems in a Dynamic Environment. In Proceedings of the
Third Int'l Conf on Information and Knowledge Management, pages
337--346. Gaithersburg, MD, Nov 1994.
Michael Siegel and Stuart Madnick. A Metadata Approach to Resolving
Semantic Conflicts. Proceedings of the 17th International Conference
on Very Large Data Bases. 1991.
The virtual enterprises of the future must address infrastructure
issues such as authentication, authorization, security, recovery, etc.
when providing information to users outside the private domains of their
individual members. The National Industrial Information
Infrastructure Protocols consortium is developing a Reference
Architecture and a Reference Implementation of such an environment.
Central to this project is the role of Organizational and Resource
intelligent agents that act as high level decision making elements and
the role of Mediators and Negotiators who support the decision making
by establishing local "infobases" that resolve the semantic mismatches
that occur when enterprises, each with different tools, terminologies,
procedures and criteria, attempt to organize their process under a new
shared workflow.
The NIIIP protocols integrate the standards activities of several disparate
communities (the Internet Society, the Object Management Group,
ISO STEP, the Workflow Management Coalition, and I3) into a common set
of protocols, packaged across 13 components to facilitate binding to
COTS (commercial off the shelf) products. This paper will describe
the overall infrastructure, as well as the protocols used in each
component. The intended operation of the NIIIP infrastructure will
be described using a consortium demonstration scenario.
An important service in an intelligent information infrastructure is the
maintenance of the integrity of the stored information. Historically,
database management functions, such as transaction management and
data integrity constraints, were developed in part to relieve application
programmers of developing software that manages the consistency of the
application data. As applications that update information across multiple
resources are created, consistency of the information again becomes an
important management function which should be provided as a coordination
service to applications.
Recently, many extended transaction management models have been proposed to
maintain consistency in multidatabase environments. Similarly, interdatabase
constraint managers are being developed, some of which generate multidatabase
transactions or other triggering mechanisms to implement constraint
enforcement. The objectives of these works are partly related, yet a
consistent framework for how these coordination services should interoperate
is not defined.
In this paper, we analyze the fundamental goals of constraint and
transaction management in a multiple information resource environment and
how the work of consistency management should be divided between these
coordination services. From this analysis, we define a set of constraint
rules for the design of a new system architecture, [V], for the integration
of next-generation information resources.
The paper is intended to set forth the principles of a uniform
mathematical means of (a) describing structured meanings of arbitrarily
complicated real
discourses pertaining to science, technology, medicine, law,
business, etc., (b) representing knowledge about the world, (c)
building semantic representations of complicated visual images.
These principles are provided by the theory of K-calculuses and
standard K-languages, which form the central constituent of Integral
Formal Semantics (IFS), a new and powerful approach to the
formalization of the semantics and pragmatics of natural language (NL)
developed by V.A. Fomichov and presented, in particular, in
several large publications in English.
The results to be stated in the paper open broad and unique
prospects for designing full-text databases, visual
information management systems, and hybrid knowledge representation
languages.
A formal heterogeneous software design [28] begins with formal
specifications (see [2,8,26] for example) for the programs and stores
them in a knowledge base [25]. A formal specification is based on a
formal language and makes use of defining axioms and possibly
mathematical structures to characterize modules or programs and to
define software agents [32]. A formal specification in our approach
consists of a signature specifying the type names, and the operations
on types, including the rank and the arity of the operations. It also
includes a set of axioms which recursively define the operations.
Such specifications capture the intended meaning of the possible
computation sequences applying a given set of objects and operations,
thus specifying a program[1]. Rather than tuning the specifications
for efficiency, we argue that efficiency can be achieved by
transformation. Note that transformations are one method of program
tuning, but tuning the specifications is not achieved through program
transformation techniques.
The specifications are tuned by discovering new rules or defining
axioms which can subsume other such axioms, or introducing rules which
can be more efficiently realized given a target implementation
language. Completing an incomplete set of defining rules through
implementation constraints or gradual domain knowledge learning is
another important aspect of specification tuning. Meta-programming is
a technique for syntax tree mapping from one language to another. We
have put forth the paradigm of mapping the specifications to a
high-level language for which an optimizing compiler already
exists [15,21,6,8].
The programs that literally implement the mappings are defined by
meta-programs [18,23], but with a prudent choice of the syntax tree
maps during meta-programming, such that correct and efficient
implementation maps are attained. A meta-program is a program that
manipulates programs, in the sense that the objects on which the
meta-program's functions act are program constructs of a source
language. Meta-programs work at the level of abstract syntax trees and
allow us to transform the syntax trees in the specification language,
to the syntax trees for a target high level language code, while
allowing us to manually code-in the implementation map[1] during the
translation. Implementations are homomorphic maps of the syntax trees
in one language to another, such that the semantics (the intended
meaning) of the specifications are preserved.
The trees defined by the present approach have function names
corresponding to computing agents. The computing agent functions have
a specified module defining their functionality. These definitions are
applied at syntax tree implementation time. The homomorphism is
defined by setting the correspondence between syntax trees in the
specification language and the target language, which consists of
executable code syntax trees, through the user defined meta-programs.
The mapping preserves the operation on trees, in the image algebra of
trees for the target language.
By the image algebra of trees we intend the image of the abstract
syntax tree algebra for a source language grammar into the algebra of
abstract syntax trees for the target grammar. Such an approach allows
us to readily modify specifications and simply run through the mapping
process to produce alternate executable code for slightly different
machines or environments.
An abstract implementation is essentially a homomorphism of one
abstract algebra into another abstract algebra, mapping the algebra of
syntax trees and the associated semantic equations for a source
language (the specification language) to that of the algebra of syntax
trees and their semantic equations, for a target concrete language (a
programming language, such as LISP).
To give at least one example use of abstract data types in automatic
programming we refer to [27]. The correctness of implementation
problem is that of logically ensuring that the homomorphism is
realized correctly, satisfying some required properties. The
implementation homomorphism is automatically defined through the
inductive properties of syntax trees, through the process of algebraic
extension. By defining a meaning function for the trees defined by the
type constructor functions, one automatically derives a homomorphism
by algebraic extension, on the entire set of trees for the particular
syntax.
For example, if h is a homomorphism on syntax trees, with h(+(t1,t2))
= +(h(t1),h(t2)), then it is sufficient to know the value of h([t1])
and h([t2]), where [t] denotes a canonical term (built entirely from
type constructor functions and constants), equivalent to t. This is
because by definition any term t has to be congruent (algebraically
equivalent), through the equations defining the operations, to a
canonical one, see ([1] and [5]).
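The structural-recursion argument above can be sketched directly: defining h on canonical leaves extends it to all trees by the homomorphism property h(+(t1,t2)) = +(h(t1),h(t2)). Tuples stand in for abstract syntax trees here, and the leaf map is a hypothetical example, not the Compose-to-Prolog mapping of [21,22].

```python
# Sketch of an implementation homomorphism on syntax trees: it is
# determined by its values on canonical leaves and extended to all
# trees by structural recursion, preserving each operator node.
def hom(tree, leaf_map):
    """Extend a map on leaves to all trees by structural recursion."""
    if not isinstance(tree, tuple):           # a canonical leaf
        return leaf_map(tree)
    op, *children = tree
    return (op, *(hom(c, leaf_map) for c in children))

# Example: implement source-language integer literals as strings,
# leaving the tree shape (the operators) intact.
src = ('+', ('*', 1, 2), 3)
tgt = hom(src, str)
print(tgt)   # ('+', ('*', '1', '2'), '3')
```

The recursion mirrors the equation h(+(t1,t2)) = +(h(t1),h(t2)): only the leaf map is supplied by hand; everything else follows by algebraic extension.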
As an example, in the process of meta-programming the algebraic
specification language Compose to Prolog in [21,22], the homomorphism
was implicitly realized through the process of reverse Skolemization,
in which a Prolog predicate is defined for each type constructor
function for a given algebraic program specification. The proof of
correctness of implementation can be carried out easily by checking
that the homomorphism is realized correctly [1]. It is a process that
can make use of automatic verification and the mathematical
techniques, such as the ones proposed in [24] for tree mapping
correctness, and by this author in [1].
We present new techniques for design by software agents and new concepts of Abstract Intelligent Implementation of AI systems (AII). The stages of conceptualization, design, and implementation are defined by AI agents and mediators. Multiagent implementations are proposed to facilitate a software design methodology which incorporates the object-level nondeterministic knowledge learning and knowledge representation methods developed in [12]. Objects, message-passing actions, and implementing agents are defined by syntactic constructs, with agents appearing as functions, expressed in an abstract specification language capable of specifying modules, agents, and their communications ([11], for example). By defining Agent Provocateurs, events and activity are computed for the AII agents. The proposed abstract intelligent implementation techniques provide a basis for an approach to automatic implementation by Intelligent Free Trees from knowledge presented in an algebraic parameterized language. The object-level definitions for individual modules are turned into executable programs by source abstract syntax tree to target abstract syntax tree morphisms. AII techniques are applied to define an Ontology Preservation Principle. An overview of validation and verification of AI systems is presented as a direct application of the above AII techniques.
Two different aspects of data management are addressed by description logics (DL) and databases (DB): the semantic organization of data and powerful reasoning services (by DL), and their efficient management and access (by DB). This paper shows how assertional knowledge of a DLMS and data of a DBMS can be uniformly accessed. Our extended paradigm integrates the separately existing retrieval functions of DL and DB in order to allow, via a query language grounded on DL-based schema knowledge, the uniform formulation and answering of queries for retrieving data from mixed knowledge/data bases. In this way the advantages of DL and DB are brought together. Thus, this technique can be used in a DLMS for the efficient management of large amounts of data, and in a DBMS for the semantic organization of unstructured, and possibly heterogeneous and distributed, databases.
Significant roadblocks to intelligent information and services integration include: (1) misinterpretation of data across contexts, applications, and users; (2) lack of seamless function integration among distributed applications; and (3) lack of support for change propagation across enterprise contexts. The Integrated Development System Environment (IDSE) is a distributed computing environment that provides automated support for both information and function integration among distributed applications. The foundation for both types of integration is the use of ontologies. Information integration is achieved through the interpretation of data and the dynamic propagation and enforcement of constraints based on ontological descriptions. Function integration is achieved through an Integration Service Manager (ISM) that can process service requests from applications based on ontological knowledge of the tools integrated into the environment. This paper describes the implementation of the IDSE and explains how the environment supports the intelligent integration of both information and services. Successes as well as problems and limitations encountered during the implementation of the IDSE are also discussed.
The first step in interoperating among heterogeneous databases is
semantic integration: Producing metadata that describes relationships
between attributes or classes in different database schemas. The
process of semantic integration cannot be "pre-programmed" since the
information to be accessed is heterogeneous. Intelligent information
integration involves automatically extracting semantics, expressing
them as metadata and matching semantically equivalent data elements.
Semint (SEMantic INTegrator), developed at Northwestern University, is
a prototype mediator that assists in semantic integration.
Semint supports access to a variety of database systems and utilizes
both schema information and data contents to produce matching rules
between database schemas. In Semint, the knowledge of how to match
equivalent data elements is "discovered", not "pre-programmed".
In this paper we will provide theoretical background and
implementation details of Semint. Experimental results from running
Semint on large and complex databases will be presented. We discuss
the effectiveness of different types of metadata (discriminators) in
determining attribute similarity. We also introduce a framework for a
dynamic semantics-based query language for multidatabase systems and
discuss other applications (such as digital libraries) that could use
Semint as part of a complete semantic integration service.
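The discriminator idea above can be sketched as follows; the specific discriminators (type code, length, nullability) and the nearest-neighbour matching are illustrative assumptions, since Semint itself trains a neural network on such metadata vectors rather than using a fixed distance.

```python
# Hedged sketch of discriminator-based attribute matching in the spirit
# of Semint: each attribute is summarized by a vector of metadata
# discriminators, and attributes of two schemas are matched by distance.
def distance(v1, v2):
    """Euclidean distance between two discriminator vectors."""
    return sum((a - b) ** 2 for a, b in zip(v1, v2)) ** 0.5

def match_attributes(schema_a, schema_b):
    """Pair each attribute of schema_a with its nearest neighbour in schema_b."""
    matches = {}
    for name_a, vec_a in schema_a.items():
        matches[name_a] = min(schema_b, key=lambda n: distance(vec_a, schema_b[n]))
    return matches

# Hypothetical discriminator vectors: (type code, max length, nullable).
emp = {'emp_name': (1, 40, 1), 'salary': (2, 8, 0)}
staff = {'wage': (2, 8, 0), 'full_name': (1, 45, 1)}
print(match_attributes(emp, staff))
# {'emp_name': 'full_name', 'salary': 'wage'}
```

The point of the sketch is that the matching rules are derived from the metadata itself, not pre-programmed per schema pair.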
The paper will contain technical approaches to an intelligent integration of information for the purpose of environmental protection including
This paper describes an intelligent assistant that is designed to help information specialists select and combine data sources to produce new information services. Rather than attempting to fully plan out and answer queries autonomously, the assistant provides interactive knowledge-based support for a specialist designing a query plan. The assistant helps its user deal with the heterogeneity of data sources at two levels. First, it deals with schema heterogeneity by describing sources' content in the vocabulary of a central conceptual ontology. Users refer to concepts in this ontology, and the assistant finds the source(s) needed to retrieve or construct those concepts. Second, it deals with instance-level heterogeneity -- where data items in sources refer to the same concept instance differently -- by providing an extended set of relational operators to be used in query plans. In addition to standard relational operators (equi-join, select, etc.), the assistant provides heuristic operators such as heuristic join [Huffman&Steier95], that use heuristic matching to integrate sources with instance-level heterogeneity. Finally, the assistant makes use of meta-information about sources such as access cost and accuracy in building query plans.
Integrating multiple sources of information into one unified database requires more than structurally integrating diverse database access methods and data models. In applications where the data is corrupted (incorrect or ambiguous), the problem of integrating multiple databases is particularly challenging. We call this the "merge/purge" problem. The key to successfully solving merge/purge is "semantic integration", which requires a means of identifying "similar" data from diverse sources. We use a rule program that declaratively determines when two pieces of information are similar and represent some aspect of the same domain entity. However, since the number and size of the data sets involved may be large, the number of records to be compared at a time by the rule program must be limited to a small number of "good" candidates. Also, large-scale parallel and distributed computing systems may be the only hope for achieving good performance in a reasonable amount of time at acceptable cost. In this paper we describe the "sorted neighborhood" method for solving merge/purge and provide experimental results that suggest this method will work well in practice (reasonable execution times and accurate results). As expected, a tradeoff between accuracy and execution time exists. We explore means of improving the accuracy of the results without severely affecting the execution time of our algorithms.
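The sorted-neighborhood method can be sketched minimally: sort the records on a discriminating key, then slide a small fixed-size window over the sorted list and apply the expensive matching rule only to records inside the same window. The sort key and match rule below are illustrative assumptions, not the paper's actual rule program.

```python
# Minimal sketch of the sorted-neighborhood method for merge/purge.
def sorted_neighborhood(records, key, is_match, window=3):
    """Compare each record only with its window-1 predecessors in key order."""
    ordered = sorted(records, key=key)
    pairs = []
    for i, rec in enumerate(ordered):
        for other in ordered[max(0, i - window + 1):i]:
            if is_match(rec, other):
                pairs.append((other, rec))
    return pairs

# Toy (name, address) records with a slightly corrupted name.
recs = [('Smith', '123 Elm'), ('Smyth', '123 Elm'), ('Jones', '9 Oak')]
dupes = sorted_neighborhood(
    recs,
    key=lambda r: r[0][:2] + r[1],                 # crude sort key
    is_match=lambda a, b: a[1] == b[1] and a[0][0] == b[0][0],
)
print(dupes)   # [(('Smith', '123 Elm'), ('Smyth', '123 Elm'))]
```

The window bounds the quadratic pairwise comparison to O(n * window) rule evaluations after an O(n log n) sort, which is the source of the accuracy/time tradeoff the abstract mentions.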
Research in database integration has been directed for the most part at resolving schema-level incompatibility issues. However, integrating all records which represent the same real-world entity is an important task in performing database integration, which has not been addressed much in the literature. A common identification mechanism for similar records across heterogeneous databases is usually not available. Entity identification, i.e., the task of integrating records from different databases that represent the same entity, in such cases can be performed by examining the relationships between various attribute values among the records. This process makes use of additional knowledge of data semantics available to the users familiar with the data. We propose the use of distances between attribute values as a measure of closeness between the records they represent. Record matching conditions for entity identification can then be expressed as a suitable combination of these pairwise attribute distances. Using a distance-based approach for matching records allows the design of more efficient and effective methods for entity identification. Due to various data sources having been developed independently, and often by different groups of individuals, there is no easy way to obtain the record matching conditions. Our approach uses knowledge discovery techniques to automatically derive these conditions (expressed as decision trees) from the data. In this paper we describe the distance-based framework for performing the instance-level integration of databases. The results we obtained from performing entity identification on real-world databases from the telecommunication industry are presented. The results from our experiments demonstrate that our approach is highly effective for performing entity identification.
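The distance-combination idea can be sketched as follows; the attribute distances, weights, and threshold below are illustrative assumptions, since the paper derives such matching conditions automatically from the data as decision trees rather than fixing them by hand.

```python
# Sketch of distance-based entity identification: per-attribute
# distances are combined into a record-matching condition.
def edit_like(a, b):
    """Crude string distance: fraction of character positions that differ."""
    n = max(len(a), len(b))
    return sum(x != y for x, y in zip(a.ljust(n), b.ljust(n))) / n

def records_match(r1, r2, threshold=0.25):
    """Hypothetical matching condition over two attribute distances."""
    name_d = edit_like(r1['name'], r2['name'])
    phone_d = 0.0 if r1['phone'] == r2['phone'] else 1.0
    return 0.7 * name_d + 0.3 * phone_d < threshold

a = {'name': 'John Q. Smith', 'phone': '555-1234'}
b = {'name': 'John O. Smith', 'phone': '555-1234'}
print(records_match(a, b))   # True: one differing character, same phone
```

Expressing the condition as a weighted combination of pairwise attribute distances is what makes it learnable: a decision-tree inducer can search over such thresholds instead of a human encoding them.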
We present a query mediation approach to the interoperation of autonomous heterogeneous databases containing data with semantic and representational mismatches. We develop a mediation architecture of interoperation that facilitates query mediation, and formalize the semantics of query mediation. The main contributions are the automated mediation of queries between databases, and the separation of semantic heterogeneity from representational heterogeneity. Query mediation in heterogeneous legacy databases makes both the data and the applications accessing the data interoperable. Automated query mediation relieves users from the difficult task of resolving semantic and representational mismatches. Decoupling semantic and representational heterogeneity improves the efficiency of automated query mediation. Our approach provides a seamless migration path for legacy databases, enabling organizations to leverage off investments in legacy data and legacy applications.
Many automated information systems need to: (1) transform and
cache information from dynamic, shared databases, (2) reason about the
current state of those data, and (3) perform long-running tasks but
cannot lock the objects about which they are reasoning, so as to allow
concurrent access by other applications. Many of these applications
can tolerate some deviation between the state of their caches and that
of the shared databases, as long as this deviation is within specified
tolerances. This paper describes an active agent approach to cache
management for such applications.
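Tolerance-based caching of this kind can be sketched minimally. In the described approach, active database rules on the shared source enforce the tolerance and push refreshes; the pull-based check below is only an illustrative stand-in, and the numeric-deviation measure and names are assumptions.

```python
# Minimal sketch of a cache that tolerates bounded deviation from its
# shared source: the cached copy is refreshed only when the deviation
# exceeds an application-specified tolerance.
class QuasiCache:
    def __init__(self, read_source, tolerance):
        self.read_source = read_source
        self.tolerance = tolerance
        self.cached = read_source()
        self.refreshes = 0

    def get(self):
        current = self.read_source()
        if abs(current - self.cached) > self.tolerance:
            self.cached = current          # deviation too large: refresh
            self.refreshes += 1
        return self.cached

source = {'v': 100.0}
cache = QuasiCache(lambda: source['v'], tolerance=5.0)
source['v'] = 103.0
print(cache.get())   # 100.0 -- within tolerance, stale copy served
source['v'] = 110.0
print(cache.get())   # 110.0 -- tolerance exceeded, cache refreshed
```

The payoff is concurrency: because small deviations are tolerated, the application never needs to lock the shared objects it reasons about.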
In previous work [SelKer93a, SelKer93b, SelKer95], we described
the data consistency requirements of applications with the
characteristics described above and proposed an architecture for
addressing those requirements. The approach has some unique features:
(1) it permits applications to specify their data consistency
requirements using a declarative language, (2) it automatically
generates the rules and other database objects necessary to enforce
those consistency requirements, shielding the application developer
from the implementation details of consistency maintenance, and (3)
it provides an explicit representation of consistency constraints in
the database, which allows them to be reasoned about and changed
dynamically to adapt to evolving situations.
Since the publication of [SelKer93a, SelKer93b, SelKer95],
the proposed approach has been refined, implemented, and used to
construct an application. This paper makes a number of new
contributions. First, it introduces and formalizes quasi-view
objects, which extend quasi-caching to support the transformation of
objects before they are cached. Second, it presents a declarative
language for specifying quasi-views, based on a modest extension to
SQL. Third, it presents techniques for automatically generating (from
a declarative quasi-view specification) active database rules to
maintain the quasi-view. The approach represents "staleness"
conditions explicitly, so that they can be queried and manipulated by
users and applications. Fourth, it describes an implementation of the
approach in prototype software. Finally, the paper presents a cost
model that demonstrates that the approach has the potential to scale
up to large databases having many quasi-views.
[SelKer93a] Seligman, L. Kerschberg, "An Active Database
Approach to Consistency Management in Data- and Knowledge-based
Systems," International Journal of Intelligent and Cooperative
Information Systems (IJICIS), 2(2), 1993.
[SelKer93b] Seligman, L. and L. Kerschberg, "Knowledge-base/Database
Consistency in a Federated Multidatabase Environment," IEEE Research
Issues in Data Engineering: Interoperability in Multidatabase Systems,
Vienna, Austria, IEEE Computer Society Press, 1993.
[SelKer95] Seligman, L. and L. Kerschberg, "Federated Knowledge and
Database Systems: A New Architecture for Integrating AI and
Database Systems," Advances in Databases and Artificial Intelligence,
Vol. 1: The Landscape of Intelligence in Database and Information
Systems. L. Delcambre and F. Petry, JAI Press, 1995.
Information integration is enabled by having a precisely defined
common terminology. We call this combination of terminology and
definitions an ontology. We have developed a set of tools and services
to support the process of achieving consensus on such a common ontology
by distributed groups. These tools make use of the World-Wide Web to
enable wide
access and provide users the ability to publish, browse, create, and
edit ontologies stored on an ontology server. Users can quickly
assemble a new ontology from a library of modules. We discuss how our
system was constructed, how it exploits existing protocols and browsing
tools, and our experience supporting hundreds of users. We describe
applications using our tools for achieving consensus and integrating
information.
The Internet provides dramatic new opportunities for gathering
information from multiple, distributed, heterogeneous information
sources. However, this distributed environment poses difficult
technical problems for the information-seeking client, including
finding the information sources relevant to an interest, formulating
questions in the forms that the sources understand, interpreting the
retrieved information, and assembling the information retrieved from
several servers into a coherent answer.
We describe and demonstrate enabling technology for addressing these
problems, particularly as they occur in Electronic Commerce applications.
We focus on techniques needed to enable a marketplace of network-based
information brokers that retrieve information about services and products
whose descriptions are available via the Internet from multiple vendor
catalogs and data bases. The services provided by such brokers include:
- Facilitating a human or computer client in the task of formulating a
query in a domain-specific vocabulary provided by the broker.
- Identifying information sources that are relevant to answering a
query.
- Translating a query into the ontology and syntax required by a given
information source, obtaining responses to the query, and
translating the responses into the broker's ontology and syntax.
- Aggregating, presenting, and explaining the responses to a query.
Our goal is to enable vendor and buyer communities to build
their own information brokers. To do this, we will solve a set of
technical problems involved in brokering, embody the solutions in an
information brokering architecture, build tools that facilitate the
construction of brokers using that architecture, and build example
brokers using the architecture and tools. It is essential to reduce
the cost of integrating information sources and to provide a path that
allows for incremental integration that can be responsive to client
demands. We present an approach to integrating disparate
heterogeneous information sources that uses context logic. Our use of
context logic reduces the up-front cost of integration, provides an
incremental integration path, and allows semantic conflicts within a
single information source or between information sources to be
expressed and resolved.
Determining the correspondences between different database schema specifications is the most difficult and time-consuming activity to be performed during the construction of an interoperable database schema. Such correspondences may be the source of conflicts when integrating the schemas and thus must be detected and resolved. A manual inspection of the class definitions in each database and a comparison with each class definition in the other participating databases may result in an almost endless process. Moreover, this process of schema comparison has so far been a purely manual activity. To support a federation manager during this activity we propose a computerized tool to extract the semantics from schema definitions, to transform them into a unique vector representation of each class, and to use the class vectors to train an artificial neural network in order to determine categories of classes. The output of the tool is a `first guess' as to which concepts in the schemas may be overlapping and which concepts do not overlap at all. This may be of tremendous value because the designers are relieved of the burden of manually inspecting all the classes and can direct their focus to classes grouped by the tool into the same category.
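The class-vector categorization can be sketched as follows; simple cosine-similarity grouping stands in for the tool's trained neural network, and the feature vectors are hypothetical.

```python
# Hedged sketch: class definitions are reduced to fixed-length feature
# vectors, which are then grouped into candidate categories of
# possibly-overlapping classes by similarity.
def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = lambda w: sum(a * a for a in w) ** 0.5
    return dot / (norm(u) * norm(v))

def categorize(classes, threshold=0.98):
    """Greedily group class vectors whose similarity exceeds threshold."""
    categories = []            # list of (representative vector, member names)
    for name, vec in classes.items():
        for rep, members in categories:
            if cosine(rep, vec) >= threshold:
                members.append(name)
                break
        else:
            categories.append((vec, [name]))
    return [members for _, members in categories]

# Hypothetical vectors: (n attributes, n string attrs, n numeric attrs).
classes = {'db1.Person': (5, 3, 2), 'db2.Employee': (5, 3, 2),
           'db1.Invoice': (8, 1, 7)}
print(categorize(classes))   # [['db1.Person', 'db2.Employee'], ['db1.Invoice']]
```

Each resulting group is only a `first guess' for the federation manager to inspect, exactly as the abstract describes.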
Interoperability between databases provides a uniform
way to access data stored over different sites. An
object-oriented data model is generally used to reduce
heterogeneity amongst the database components (e.g., [18,19,20]),
in which a global schema is defined from an integration of local
schemas. A user expresses his/her requirements using the local
query language(s), and these are then translated to the global
schema. In this paper we address these transformations. We provide
a framework which transforms a relational schema into an
object-oriented schema identifying implicit knowledge contained
within relations, keys and referential integrity constraints.
To do so, a relational schema is classified into three
subsets of relations to reflect the different concepts of an
object-oriented schema:
. Base relations: they are relations which are independent
of other relations of the database. They are translated
directly into re-usable classes.
. Dependent relations: they express relationships
between two base relations. They generally simulate either
binary relationships, such as aggregations, or simple
inheritance relationships.
. Composite relations: these relations are the
generalization of dependent relations and express relationships
among different relations. A composite relation simulates either
an association class or multiple inheritance between existing
classes.
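The three-way classification above can be sketched as follows, assuming each relation is summarized by the list of relations it references through foreign keys. The relation names and the exact classification conditions are illustrative simplifications, not the paper's rules.

```python
def classify(relations):
    """Split a relational schema into base, dependent and composite relations.

    - base:      no references to other relations -> translated into classes
    - dependent: references exactly two base relations -> a binary
                 relationship (aggregation or simple inheritance)
    - composite: references more than two relations -> association class
                 or multiple inheritance
    """
    base = {r for r, fks in relations.items() if not fks}
    dependent = {r for r, fks in relations.items()
                 if len(fks) == 2 and set(fks) <= base}
    composite = set(relations) - base - dependent
    return base, dependent, composite

# Hypothetical schema: relation name -> referenced relations.
schema = {
    "Person":     [],
    "Company":    [],
    "Project":    [],
    "Employment": ["Person", "Company"],             # binary relationship
    "Assignment": ["Person", "Company", "Project"],  # ternary relationship
}

base, dependent, composite = classify(schema)
```

Under this toy schema, `Person`, `Company` and `Project` become kernel classes, `Employment` a reference between them, and `Assignment` an association class.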
Based on the above classification, appropriate translation rules and
algorithms are provided to generate a sub-schema of the global schema.
Since queries on a global schema are expressed using local query
language(s), relational algebraic expressions are also transformed
into an object-oriented model. To do so, every algebraic expression
is decomposed according to the types of relations involved in the
expression (such as base, dependent and composite relations). The
result of the decomposition is a tree, called algebraic tree, in which
- a node represents a subquery which relates to a single object-oriented
class (i.e., class local query), and
- an edge of the tree models the dependencies between the subqueries.
Every algebraic tree is implemented as a set of procedural methods.
Indeed, a node is implemented as a method using predefined methods of
the corresponding class (e.g., get() and set() methods). An edge between
two nodes is implemented as a calling relationship between the methods
related to the nodes of the algebraic tree. Appropriate algorithms which
generate methods for relational algebraic expressions are also provided.
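As a rough illustration of the algebraic tree described above, the sketch below models a node as a class-local subquery and an edge as a calling dependency between subqueries. The class name, attribute names and data are hypothetical, not drawn from the paper.

```python
class AlgebraicNode:
    """One node of an algebraic tree: a subquery against a single class."""

    def __init__(self, class_name, operation, children=None):
        self.class_name = class_name  # the o-o class this subquery targets
        self.operation = operation    # callable: (objects, child_results) -> result
        self.children = children or []

    def evaluate(self, extents):
        """Evaluate bottom-up: a node 'calls' the results of its children."""
        child_results = [c.evaluate(extents) for c in self.children]
        return self.operation(extents[self.class_name], child_results)

# Hypothetical class extent.
extents = {"Employee": [{"name": "Ann", "salary": 50},
                        {"name": "Bob", "salary": 30}]}

# A selection node feeding a projection node (an edge between subqueries).
select_node = AlgebraicNode(
    "Employee", lambda objs, _: [o for o in objs if o["salary"] > 40])
project_node = AlgebraicNode(
    "Employee", lambda objs, kids: [o["name"] for o in kids[0]],
    children=[select_node])

result = project_node.evaluate(extents)  # -> ["Ann"]
```

The recursive `evaluate` call plays the role of the calling relationship that the paper implements between the methods generated for adjacent nodes.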
OVERVIEW OF THE APPROACH
Cooperation between autonomous databases has been an area of great
interest in the last few years. This capability is called
interoperability, and the system which manages it is called a
federated database system [18,8]. To allow
interoperability, a "rich" data model is used as a canonical model
where all local information can be expressed using the concepts of
this model. Object-oriented models are generally considered as
"good" canonical models because they provide richer abstractions
than those of existing models (e.g., the relational model)
[18,19,20]. Each local database system (LDS) supports export/import
mechanisms to the canonical model, and LDSs use transformations that
translate schema (or sub-schema), data or queries to/from the canonical
model. This paper addresses these transformations and provides a
framework which allows the generation of o-o schema and algebraic
operations from a relational database.
The translation between data models has been addressed by several
researchers. Initially Zaniolo [24] developed a tool that automatically
generates relational schemas from CODASYL schemas. The approaches
provided in [13,5] are concerned with transformations between
extensions of the ER (Entity Relationship) and the relational data
models. Lien [11] described mappings from the hierarchical to the
relational data model. Tsichritzis and Lochovsky [17] provided a
summary of different types of mappings. With advances in research
on federated databases, translation has become a key issue
because of the necessity to access heterogeneous information. Several
translation approaches have been proposed in the context of federated
databases (e.g., [4,7,12]). Castellanos and Saltor, in [4], have
used enrichment techniques to identify o-o constructs from a
relational schema. Ling [12] proposed an approach similar to that
of [4], with some extensions for aggregation relationships.
The existing translation approaches above are useful; however, their
usability is limited in the context of o-o databases. Most of these
approaches do not provide a translation framework that is consistent
with the object paradigm. The following is a summary of the problems
with the existing translation approaches. Most of the approaches
. do not fully take into account the complexity of o-o models.
Generally they provide a "partial" mapping of relational
schemas in which few concepts of o-o schemas are covered
(e.g., class identification). Also, (simple and multiple)
inheritance and other forms of aggregation,
such as associations, are not considered.
. still use the relational "philosophy" for building an o-o schema.
However, in general, the o-o philosophy is founded on an iterative
and incremental design approach (e.g., Booch [3], Coad & Yourdon
[23] and so on) in which the final o-o design is a refinement of
the initial design.
. address only some aspects of relational database applications.
Other aspects, such as algebraic expressions, are not
translated.
In this paper we propose a translation methodology that generates an
o-o database and overcomes the problems described above. It is worth
mentioning that the proposed methodology results in revealing implicit
semantics within relational specifications which are subsequently made
explicit - as much as possible - within the target o-o database. The
proposed work extends the results on method translation
[19,21,14] into a general translation framework that includes
both schema and algebra translation.
. The mapping process we propose is consistent with the o-o
"philosophy" (e.g., Booch [3], Coad & Yourdon [23]) in the sense
that the building of an o-o application is an incremental and
iterative process in which the final design is obtained by
successive refinements. We first identify those relations that
will serve as kernel classes for building the whole o-o
schema. Such relations are called base relations and
are translated into o-o classes. In the second step, those relations
that simulate binary relationships (i.e., references) between
classes are identified. Nested binary relationships
are also identified at this step. These nested binary relationships
simulate nested aggregations between o-o classes and at the same
time "behave" as ternary relationships in the relational
application. Relations which represent simple or nested binary
relationships are called dependent relations. These relations
are either translated as references or inheritance relationships
between classes.
The final step of the translation is concerned with the relations
that simulate either multiple inheritance or association between
classes. These relations are called composite relations. They
are the relations which remain in a schema after all the base and
dependent relations are translated. Composite relations are
translated into either associations or multiple inheritance.
. After an o-o schema is generated from a relational schema by
following the steps described above, the algebraic queries are
then translated. The paper focuses on the following three relational
operations: selection, projection, and join. The mapping of these
operators is closely dependent on the type of the relations
specified within the algebraic expressions. For instance, if an
algebraic expression uses a composite relation, then the appropriate
sub-queries are generated on the classes that simulate the composite
relation. Every decomposition of an algebraic operation
produces a tree, called an algebraic tree, in which
. a node of the tree relates to a sub-operation on an o-o class, and
. an edge represents a calling relationship between sub-operations.
Every algebraic tree is translated as a procedural method. To do
this, the nodes are translated into local methods by using predefined
methods, such as the get() and set() operations of classes. The edges
are implemented as calling relationships between two local methods
embedded in the nodes of algebraic trees.
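The translation of an algebraic tree into methods can be sketched as a small code generator: each node becomes one method, and each edge becomes a call from the parent's method to the child's. The node identifiers, the `query_` naming scheme and the dictionary-based extents are illustrative assumptions, not the paper's notation.

```python
def emit_methods(tree):
    """tree: dict node_id -> (class_name, predicate_src, child_ids).
    Returns Python source defining one method per node of the tree."""
    lines = []
    for node_id, (cls, predicate, children) in tree.items():
        # An edge is implemented as a call to the child node's method;
        # a leaf reads the extent of its class directly.
        calls = " + ".join(f"query_{c}(db)" for c in children) or f"db['{cls}']"
        lines.append(f"def query_{node_id}(db):")
        lines.append(f"    # subquery on class {cls}")
        lines.append(f"    return [o for o in ({calls}) if {predicate}]")
    return "\n".join(lines)

# A two-node tree: a selection leaf n2, called by the root n1.
tree = {
    "n2": ("Employee", "o['salary'] > 40", []),
    "n1": ("Employee", "True", ["n2"]),
}

namespace = {}
exec(emit_methods(tree), namespace)          # define the generated methods
db = {"Employee": [{"salary": 50}, {"salary": 30}]}
answer = namespace["query_n1"](db)           # -> [{'salary': 50}]
```

In the paper the generated methods would instead use the predefined get() and set() operations of the target classes; the generator here only conveys the node-to-method and edge-to-call correspondence.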
The introduced algebraic trees facilitate the implementation
of relational operations in object-oriented databases. In fact, if
a complex algebraic operation is defined on a set of relations,
algebraic trees are derived for every selection, projection, and join
operation. The nodes of these trees are typed by the classes to
which they relate. The nodes of the algebraic trees which relate to a
common type are merged to form more complex nodes.
ORGANISATION OF THE PAPER
In this paper we develop the above translation framework. The next section
introduces the preliminary definitions and notation. Section 3 provides
concepts and rules to generate an object-oriented schema from a relational
schema. Translation rules for relational algebraic expressions are described
in section 4. In section 5 we propose algorithms that generate the
object-oriented specification of a relational database. Finally, section 6
concludes with possible extensions of the proposed approach.
REFERENCES
[1] A. Anderson, W. Caelli, D. Longley, V. Murthy,
M. Papazoglou and Z. Tari: Risk Analysis Project. Queensland
University of Technology, Information Security Research Center, 1994.
Project Leader: Dr. Zahir Tari. Six volumes have been produced: The
Security Architecture (Vol 1), Risiko Daten Speicher (RDS) (Vol. 2),
Platform Domain (Vol. 3), Information Assets and Process Domain (Vol.
4), Mappings (Vol. 5) and QUT-NT Prototype for Risk Analysis (Vol. 6).
[2] M. Atkinson, F. Bancilhon, D. DeWitt, K. Dittrich,
D. Maier, and S. Zdonik: The Object-Oriented Database System Manifesto.
Proc. of 1st Deductive Object-Oriented Database Conf., Kyoto, 1989.
[3] G. Booch: Object-Oriented Analysis and Design. Addison Wesley, 1994.
[4] M. Castellanos and F. Saltor: Semantic Enrichment of Database Schemas:
An Object Oriented Approach. Proc. of the 1st Workshop on Interoperability
in Multidatabase Systems, April 1991, Kyoto.
[6] S. Ceri: Methodology and tools for database design. North-Holland,
Amsterdam, 1983.
[7] U.S. Chakravarthy: Semantic Query Optimisation in Deductive Databases.
Ph.D. Thesis, Dept. of Computer Science, University of Maryland, 1986.
[8] L.A. Kalinichenko: Methods and Tools for Equivalent Data Model Mapping
Construction. Proc. of EDBT, March 1990, Venice.
[9] First International Workshop on Interoperability in Multidatabase Systems.
April 1991, Kyoto.
[10] M.R. Genesereth, N.P. Singh and M.A. Sayed: A Distributed and Anonymous
Knowledge Sharing Approach to Software Interoperation. Proc. of the
FGCS'94 Workshop on Heterogeneous Knowledge-Bases, Tokyo, Dec. 1994.
[11] M. Jarke: External Semantic Query Simplification: A Graph Theoretic
Approach and its Implementation in Prolog. In Expert Database Systems,
Kerschberg (ed.), Benjamin/Cummings Publishing Co, 1985.
[12] Y. Lien: Hierarchical schemata for relational databases. ACM Trans.
Database Syst., 6(1), 1981, pp. 48-69.
[13] L.L. Yan and T.W. Ling: Translating Schema With Constraints into OODB
Schema. North Holland, "In Semantic of Interoperable Systems", D. K.
Hsiao, E. J. Neuhold and R. S. Davis (eds.), 1993.
[14] M. Markowitz and V.M. Shoshani: On the correctness of representing
extended Entity-Relationship structures in the relational model.
Proc. of Int. Conf. on the Management of Data, Portland, 1989.
[15] M. Papazoglou, Z. Tari and N. Russel: Object Oriented Technology for
Inter-Schema and Language Mappings. To appear as a book chapter in O.
Bukhres and A.K. Elmagarmid (eds.), "Object Oriented Multidatabase
Systems: A Solution for Advanced Applications", Prentice Hall, 1994.
[16] S. Shenoy and Z. Ozsoyoglu: A System for Semantic Query Optimisation.
Proc. of the ACM- SIGMOD Conference on Management of Data, 1987.
[17] S. Shenoy and Z. Ozsoyoglu: Design and Implementation of a Semantic
Query Optimiser. IEEE Transactions on Knowledge and Data Engineering,
1(3), 1989.
[18] D. Tsichritzis and F. Lochovsky: Data Models, Chap. 14. Prentice-Hall,
Englewood Cliffs, N.J., 1982.
[19] A. Sheth, J. Larson: Federated Database Systems for Managing
Distributed, Heterogeneous, and Autonomous Databases. ACM Computing
Surveys, 22(3), Sept. 1990, pp. 183-236.
[20] Z. Tari: Interoperability Between Data Models. North Holland, "In
Semantic of Interoperable Systems", D. K. Hsiao, E. J. Neuhold and
R. S. Davis (Eds.), 1993.
[21] Z. Tari: ERC++: A Data Model that Combines Objects and Rules. Proc. of
Int. Conf. on Information and Knowledge Management, Washington, 1993.
[22] Z. Tari: On the Design of Object-Oriented Databases. In Proc. of Entity
Relationship Approach, G. Pernul and A.M. Tjoa (eds.), Springer-Verlag,
1992.
[23] Z. Tari and M. Orlowski: A Distributed Object Kernel for Interoperable
Databases.
Technical report, Queensland University of Technology, Brisbane, 1995.
[24] P. Coad and E. Yourdon: Object-Oriented Analysis, Object-Oriented Design.
Prentice Hall, 1990.
[25] C. Zaniolo: Multimodel external schemas for CODASYL data base management
systems. In "Data Base Architecture", G. Bracchi and G. Nijssen (eds.),
North-Holland, The Netherlands, 1979.