Nam (Pierre) Huyn
Pierre Huyn is a Ph.D. candidate
in the Computer Science department
at Stanford University. He worked
for several years at Hewlett-Packard Laboratories in Palo Alto, prior to
enrolling in the Ph.D. program in 1992.
Pierre received his MS in Artificial Intelligence from Stanford University
and his Diplome d'Ingenieur from Ecole Nationale Superieure des Telecommunications
de Paris, France.
View Maintenance under Incomplete Information.
A data warehouse is a collection of materialized views. These views
derive their contents from base relations residing in data sources
that may or may not be local to the warehouse. To keep the views
consistent with the base data, any change reported by the data sources
must be reflected in the views. Updating a view in response to an
update to some base relation often requires examining the other base
relations that contribute to the view (e.g., when the view is a join).
However, these base relations may be costly to access; they may even
be unavailable when the view needs to be maintained. Further, since
base relations are independently updated, they may be read in an
inconsistent state, often resulting in erroneous view updates.
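As a minimal sketch of why maintaining a join view requires the other
base relation, assume a hypothetical view V(a, b, c) defined as the join
of R(a, b) and S(b, c) on b. The relation names, schemas, and sample
tuples below are illustrative assumptions, not taken from the thesis.

```python
def join(r, s):
    """Join R(a, b) with S(b, c) on b, giving the view V(a, b, c)."""
    return {(a, b, c) for (a, b) in r for (sb, c) in s if sb == b}


def delta_for_insert(new_r_tuple, s):
    """Tuples to add to V when new_r_tuple is inserted into R.

    Note the dependence on S: the update to R alone is not enough.
    """
    a, b = new_r_tuple
    return {(a, b, c) for (sb, c) in s if sb == b}


# Hypothetical base relations and the materialized view over them.
r = {(1, "x"), (2, "y")}
s = {("x", 10), ("y", 20), ("y", 30)}
view = join(r, s)

# A source reports an insertion into R; the warehouse must read S.
inserted = (3, "y")
delta = delta_for_insert(inserted, s)
view |= delta
r.add(inserted)
```

Here the incremental update agrees with recomputing the join from
scratch, but only because S was readable, and in a consistent state, at
maintenance time.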
We propose view self-maintenance as a way to resolve these
issues. In view self-maintenance, we maintain a view using the view
itself and only a specified subset of the base relations. With this
approach, we can minimize the cost to maintain the data warehouse,
shorten the time window during which the warehouse is inconsistent
with the updated data sources, and avoid view update anomalies due to
asynchronous base data updates.
With limited information, maintaining a view may not always be
possible. The main question is whether a view can be maintained at all
using only the given information subset. We can distinguish two
notions of self-maintainability (SM): compile-time SM, where
a view is self-maintainable independently of the view's contents and
the contents of the base relations, and under all updates of a certain
type; and runtime SM, where a specific view is self-maintainable
under a specific update and given the contents of a specified subset
of base relations. Efficient runtime view self-maintenance is
the main focus and the main contribution of this thesis.
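To illustrate the runtime notion, here is a minimal sketch for one
hypothetical case: a view V(a, b, c) defined as the join of R(a, b) and
S(b, c) on b, maintained under an insertion into R using the view alone.
The schema, the function name, and the test itself are assumptions made
for illustration, not the general method developed in the thesis. The
idea is that if V already holds some tuple with the inserted b value,
then some R tuple with that b exists, so every matching S tuple is
already visible in V.

```python
def try_self_maintain_insert(view, new_r_tuple):
    """Maintain V = R join S under an insert into R, using only V.

    Returns the updated view if this insert is self-maintainable given
    the current view contents, or None if the view alone cannot decide.
    """
    a, b = new_r_tuple
    # c values that S pairs with b, as far as they are visible in V.
    matching_c = {c for (_, vb, c) in view if vb == b}
    if not matching_c:
        # V shows no tuple with this b: S may or may not contain one,
        # so the new view contents cannot be determined from V alone.
        return None
    return view | {(a, b, c) for c in matching_c}


# Hypothetical current contents of the materialized view.
view = {(1, "x", 10), (2, "y", 20), (2, "y", 30)}

maintained = try_self_maintain_insert(view, (3, "y"))
undecided = try_self_maintain_insert(view, (4, "z"))
```

The first insertion can be handled without touching S; the second
cannot, and would require access to S or additional information such as
constraints.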
In particular, we need to find complete SM tests, that is, conditions
on the subset of information available for maintenance that are
necessary and sufficient to guarantee SM. These SM tests must be
efficient to evaluate, or else the purpose of avoiding costly base
data access would be defeated. Thus, we are looking for SM tests that
are expressible as efficient queries against the view itself and any
available base relation. Further, a base relation may not be
available, but constraints it satisfies often are available for free.
Ignoring these constraints may lead us to miss opportunities to
self-maintain a view. Taking full advantage of these constraints in SM
tests is another aspect of this work. Data warehouses seldom consist
of just one view. In fact, having other views available should help
maintain a given view. Thus, developing techniques to maintain
multiple views is also important.
Finally, we examine the close connection between the runtime view
self-maintenance problem and the problem of checking whether a
given update preserves global data integrity, using only a specified
subset of the relations under constraint: solutions to the latter
problem can help answer the question of whether or not a view is
independent of a given update.
Cooperative Electronic Medical Records (CEMR). Contributed to
the CEMR investigation and project definition efforts. Ported a CORBA-compliant
Object Request Broker implementation from Unix to MacOS. Defined portions
of the CEMR architecture. Designed and implemented several generic architecture
components in the CORBA/C++ binding.
Agent-Based Software Interoperability (ABSI). Designed, implemented,
and maintained portions of a prototype ABSI system. Identified the role
of ABSI as a smart extension to the client/server architecture. Identified
business cases in enterprise integration, concurrent engineering and internet
information access. Wrote the ABSI white paper and performed a competitive
study. Identified performance-related research issues in task decomposition.
Component-Based Software Reuse. Contributed to the definition
of software reuse technology at HP. Designed, implemented, and maintained
portions of a software-bus prototype that enables component-based reuse.
Designed and implemented an extension of a distributed CASE environment
that enables collaborative work, and implemented a workgroup calendar system
based on the extension.
Designworld. Designed and implemented a distributed facilitator
that uses a subset of the Knowledge Interchange Format and Knowledge Query
and Manipulation Language. Designed and implemented a network connectivity
layer for use in different facilitator implementations. Designed and implemented
several software agent components used in the Designworld system, some
from scratch and some by providing agent wrappers to legacy tools. Defined
a three-dimensional-geometry ontology for information sharing among Designworld
Nam (Pierre) Huyn
Computer Science Department
Stanford, CA 94305-9040