Nam (Pierre) Huyn

Pierre Huyn is a Ph.D. candidate in the Computer Science department at Stanford University. He worked for several years at Hewlett-Packard Laboratories in Palo Alto, prior to enrolling in the Ph.D. program in 1992.

Pierre received his MS in Artificial Intelligence from Stanford University and his Diplome d'Ingenieur from Ecole Nationale Superieure des Telecommunications de Paris, France.

Thesis Topic

View Maintenance under Incomplete Information. A data warehouse is a collection of materialized views. These views derive their contents from base relations residing in data sources that may or may not be local to the warehouse. To keep the views consistent with the base data, any change reported by the data sources must be reflected in the views. Updating a view in response to an update to some base relation often requires examining the other base relations that contribute to the view (e.g., when the view is a join). However, these base relations may be costly to access; they may even be unavailable when the view needs to be maintained. Further, since base relations are updated independently, they may be read in an inconsistent state, often resulting in erroneous view updates. We propose view self-maintenance as a way to resolve these issues. In view self-maintenance, we maintain a view using only the view itself and a specified subset of the base relations. With this approach, we can minimize the cost of maintaining the data warehouse, shorten the time window during which the warehouse is inconsistent with the updated data sources, and avoid view update anomalies due to asynchronous base data updates.

With limited information, maintaining a view may not always be possible. The main question is whether a view can be maintained at all using only the given subset of information. We can distinguish two notions of self-maintainability (SM): compile-time SM, where a view is self-maintainable independently of the view's contents and the contents of the base relations, and under all updates of a certain type; and runtime SM, where a specific view is self-maintainable under a specific update and given the contents of a specified subset of base relations. Efficient runtime view self-maintenance is the main focus and the main contribution of this thesis.

In particular, we need to find complete SM tests, that is, conditions on the subset of information available for maintenance that are necessary and sufficient to guarantee SM. These SM tests must be efficient to evaluate, or else the purpose of avoiding costly base data access would be defeated. Thus, we are looking for SM tests that are expressible as efficient queries against the view itself and any available base relations. Further, a base relation may not be available, but the constraints it satisfies are often available for free. Ignoring these constraints may lead us to miss opportunities to self-maintain a view. Taking full advantage of these constraints in SM tests is another aspect of this work. Data warehouses seldom consist of just one view. In fact, having other views available should help maintain a given view. Thus, developing techniques to maintain multiple views is also important. Finally, we examine the close connection between the runtime view self-maintenance problem and the problem of checking whether a given update preserves global data integrity, using only a specified subset of the relations under constraint: solutions to the latter problem can help answer the question of whether or not a view is independent of a given update.
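As a toy illustration of a runtime SM test (not the algorithm developed in the thesis), consider a hypothetical join view V(x, y, z) = R(x, y) JOIN S(y, z) under set semantics, where R and V are available but S cannot be accessed. The relation names, the function name, and the sample data below are all assumptions made for this sketch. When a tuple (a, b) is inserted into R, the sketch either returns the tuples to add to V or reports that the available information is insufficient.

```python
from typing import Optional, Set, Tuple

RTuple = Tuple[str, str]        # (x, y) tuple of base relation R
VTuple = Tuple[str, str, str]   # (x, y, z) tuple of view V = R JOIN S on y


def self_maintain_insert(
    view: Set[VTuple], r: Set[RTuple], new_r: RTuple
) -> Optional[Set[VTuple]]:
    """Return the tuples to add to the view when new_r is inserted into R,
    using only the view and R (S is never read).
    Return None when the update is not self-maintainable at runtime."""
    a, b = new_r
    if new_r in r:
        # Set semantics: re-inserting an existing tuple changes nothing.
        return set()
    # Any view tuple with y == b exposes every S tuple with that join value,
    # because each R tuple with y == b joins with all of them.
    zs = {z for (_, y, z) in view if y == b}
    if zs:
        return {(a, b, z) for z in zs}
    # No view tuple has y == b. If R already holds a tuple with y == b,
    # then S cannot contain a matching tuple, so the view is unaffected.
    if any(y == b for (_, y) in r):
        return set()
    # Otherwise S may or may not contain tuples with y == b: cannot decide.
    return None


# Hypothetical data: S = {("k1", "z1"), ("k1", "z2")} exists but is hidden.
r = {("x1", "k1"), ("x2", "k2")}
view = {("x1", "k1", "z1"), ("x1", "k1", "z2")}

delta_known = self_maintain_insert(view, r, ("x3", "k1"))
delta_empty = self_maintain_insert(view, r, ("x3", "k2"))
delta_unknown = self_maintain_insert(view, r, ("x3", "k3"))
```

In this sketch the test is a pair of cheap lookups against the view and R. The third call returns None because neither the view nor R reveals whether S holds a tuple with join value "k3", which is exactly the situation in which the base relation S would have to be consulted.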

Work Experience

Cooperative Electronic Medical Records (CEMR). Contributed to the CEMR investigation and project definition efforts. Ported a CORBA-compliant Object Request Broker implementation from Unix to MacOS. Defined portions of the CEMR architecture. Designed and implemented several generic architecture components in the CORBA/C++ binding.

Agent-Based Software Interoperability (ABSI). Designed, implemented, and maintained portions of a prototype ABSI system. Identified the role of ABSI as a smart extension to the client/server architecture. Identified business cases in enterprise integration, concurrent engineering, and Internet information access. Wrote the ABSI white paper and performed a competitive study. Identified performance-related research issues in task decomposition.

Component-Based Software Reuse. Contributed to the definition of software reuse technology at HP. Designed, implemented, and maintained portions of a software-bus prototype that enables component-based reuse. Designed and implemented an extension of a distributed CASE environment that enables collaborative work, and implemented a workgroup calendar system based on the extension.

Designworld. Designed and implemented a distributed facilitator that uses a subset of the Knowledge Interchange Format (KIF) and the Knowledge Query and Manipulation Language (KQML). Designed and implemented a network connectivity layer for use in different facilitator implementations. Designed and implemented several software agent components used in the Designworld system, some from scratch and some by providing agent wrappers for legacy tools. Defined a three-dimensional-geometry ontology for information sharing among Designworld agents.

Publications

My recent publications.

Personal

Nam (Pierre) Huyn
huyn@cs.stanford.edu

Computer Science Department
Gates 412
Stanford University
Stanford, CA 94305-9040
