New Bases for New Data
                Omar Benjelloun, Stanford InfoLab

The nature of data is changing. Data is distributed across a multitude
of autonomous systems and applications. Data may be uncertain (e.g.,
because it comes from sensors) and may traverse several layers of
processing. Managing and integrating such complex data is difficult.

I believe this difficulty arises because data has important new
characteristics (e.g., uncertainty or distribution) that are not
captured by common data models like the relational model or XML. In
this talk, I will
argue that "new data needs new bases". In other words, new data models
and languages are needed that provide primitives for the new
characteristics of data, and adequate optimization techniques must be
developed to carry out the processing efficiently.

I will illustrate this approach with (i) Active XML, a model for
distributed data, based on XML and Web services, with techniques to
query and typecheck such data just as if it were plain XML, and (ii)
ULDBs, a model for uncertain data and its lineage, with techniques to
efficiently evaluate queries and compute result probabilities.