What Is Data Warehousing and What Is Stanford Doing About It?

Janet Wiener

Stanford University

This talk will provide an introduction to many of the subsequent topics in CS545. Don't miss it.

A data warehouse is a repository of data integrated from multiple information sources and available for complex querying and analysis. The topic of data warehousing encompasses architectures, algorithms, and tools for (1) designing the warehouse schema; (2) creating and maintaining the warehouse; and (3) querying the warehouse data.

In this talk I will give an overview of the major issues in data warehousing and then discuss the relevant research projects in Stanford's database group. In particular, the Whips project focuses on getting data into the warehouse: extracting data from the sources, determining which portion of the source data has changed, and computing the corresponding changes to the integrated warehouse data.
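
To make the last two steps concrete, here is a minimal sketch of change detection followed by incremental maintenance. It is not the Whips algorithm; it only illustrates the general idea under simplifying assumptions: the source can be read as full snapshots keyed by a primary key, and the warehouse holds a single summed total per region. The schema and the names diff_snapshots and apply_delta are hypothetical and chosen for illustration.

```python
from typing import Dict, List, Tuple

Row = Tuple[str, int]        # (region, amount): hypothetical source schema
Snapshot = Dict[int, Row]    # primary key -> row


def diff_snapshots(old: Snapshot, new: Snapshot) -> Tuple[List[Row], List[Row]]:
    """Compare two source snapshots and return (inserted, deleted) rows.

    An updated row is reported as a deletion of its old value plus an
    insertion of its new value.
    """
    inserted = [new[k] for k in new.keys() - old.keys()]
    deleted = [old[k] for k in old.keys() - new.keys()]
    for k in old.keys() & new.keys():
        if old[k] != new[k]:
            deleted.append(old[k])
            inserted.append(new[k])
    return inserted, deleted


def apply_delta(view: Dict[str, int], inserted: List[Row], deleted: List[Row]) -> None:
    """Update the per-region totals in place using only the changed rows."""
    for region, amount in inserted:
        view[region] = view.get(region, 0) + amount
    for region, amount in deleted:
        view[region] = view.get(region, 0) - amount


# Two snapshots of the same (hypothetical) source table.
old_snapshot: Snapshot = {1: ("west", 10), 2: ("east", 5)}
new_snapshot: Snapshot = {1: ("west", 12), 3: ("east", 7)}

# Warehouse view as computed from the old snapshot.
warehouse_view: Dict[str, int] = {"west": 10, "east": 5}

inserted, deleted = diff_snapshots(old_snapshot, new_snapshot)
apply_delta(warehouse_view, inserted, deleted)
```

After apply_delta runs, warehouse_view matches what a full recomputation over new_snapshot would give, but only the changed rows were touched. A real warehouse integrates several autonomous sources and more complex views, which this sketch deliberately ignores.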