New Bases for New Data
Omar Benjelloun, Stanford InfoLab

The nature of data is changing. Data is distributed across a multitude of autonomous systems and applications. It may be uncertain (e.g., because it comes from sensors) and may traverse several layers of processing. Managing and integrating such complex data is difficult. I believe this is because data has important new characteristics (e.g., uncertainty or distribution) that are not captured by common data models like the relational model or XML. In this talk, I will argue that "new data needs new bases". In other words, we need new data models and languages that provide primitives for the new characteristics of data, together with adequate optimization techniques to carry out the processing efficiently. I will illustrate this approach with (i) Active XML, a model for distributed data, based on XML and Web services, with techniques to query and typecheck such data just as if it were plain XML, and (ii) ULDBs, a model for uncertain data and its lineage, with techniques to efficiently evaluate queries and compute result probabilities.