Web-site Management with Strudel

Alon Levy, University of Washington

Abstract

The World-Wide Web is a prime vehicle for disseminating information. Consequently, Web sites are growing in size, have increasingly complex structure, and often serve information derived from multiple data sources. Managing the content and structure of such Web sites is a novel data management problem.

I will describe the Strudel system, which is the first system to apply concepts from database management systems to the problem of Web site management. In Strudel we separate three distinct tasks in building Web sites that are usually interdependent in current Web site management tools: (1) managing the data underlying the site, (2) managing the structure of the site (i.e., specifying the data contained within each page and the links between pages), and (3) designing the graphical presentation of pages.

The key idea in Strudel is that the structure and the content of the Web site are specified declaratively in a high-level query language, StruQL. As a result, it is possible to easily restructure and modify Web sites, and to create multiple versions of a Web site from the same underlying data. The underlying declarative representation of the Web site also provides a platform for specifying and enforcing integrity constraints on sites, and for designing data warehouses for their support.
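To make the idea of deriving several site versions from one data source concrete, here is a minimal Python sketch. It is not StruQL and does not reflect Strudel's actual implementation; the edge-list data model, the sample publications, and the names `DATA`, `attribute`, and `build_site` are all assumptions made for illustration. The point is only that the site structure is computed from a short specification over the data, so changing the specification restructures the site without touching the data.

```python
from collections import defaultdict

# Hypothetical semistructured data: (object, attribute label, value) edges.
DATA = [
    ("pub1", "title", "Paper A"),
    ("pub1", "year", "1997"),
    ("pub2", "title", "Paper B"),
    ("pub2", "year", "1998"),
    ("pub3", "title", "Paper C"),
    ("pub3", "year", "1998"),
]


def attribute(data, label):
    """Return {object: value} for every edge carrying the given label."""
    return {obj: value for obj, lbl, value in data if lbl == label}


def build_site(data, group_by):
    """Derive a site graph: a root page linking to one index page per
    distinct value of `group_by`, each listing the matching titles."""
    titles = attribute(data, "title")
    keys = attribute(data, group_by)
    site = defaultdict(list)
    for obj in sorted(titles):
        page = f"{group_by}:{keys[obj]}"
        if page not in site["root"]:
            site["root"].append(page)   # link from the root to the index page
        site[page].append(titles[obj])  # content shown on the index page
    return dict(site)


# Two versions of the site from the same underlying data.
by_year = build_site(DATA, "year")
by_title = build_site(DATA, "title")
```

Here `by_year` groups the publications under one page per year, while `by_title` gives each publication its own page; only the argument that specifies the structure differs.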

After describing Strudel and some of our experiences in using the system, I will argue the following: (1) Web-site management is an important application of semistructured data; (2) StruQL, as a language for querying semistructured data, has several advantages over related languages; and, most importantly, (3) Web-site management is an important field for database research.

The Strudel system was developed jointly with Mary Fernandez (AT&T Labs), Daniela Florescu (INRIA, France), Jaewoo Kang (Severa Inc.) and Dan Suciu (AT&T Labs).

Biography

Alon Levy joined the faculty of the Computer Science and Engineering Department of the University of Washington in January 1998. Previously, he was a principal member of technical staff at AT&T (formerly Bell) Laboratories. He received his Ph.D. in Computer Science from Stanford University in 1993. Alon's interests are in Database Systems, Artificial Intelligence, and the interactions between the two fields. In particular, he is interested in query optimization, data integration, management of semistructured data, Web site management systems, and knowledge representation. His most recent projects include the Information Manifold data integration system and the Strudel Web site management system.