Data stream processing is a topic of much current interest among database researchers. This page is meant to serve as an information-sharing resource for data stream research projects. It includes links to some research project pages, along with documents and notes from some meetings among data stream researchers.
The primary goal of the Aurora project is to build a single infrastructure that can efficiently and seamlessly meet the demanding requirements of stream-based applications. To this end, we are revisiting all aspects of database design and implementation, ranging from query optimization to user interfaces. Our current research focus is on real-time data processing issues, such as QoS- and memory-aware operator scheduling, semantic load shedding for coping with transient spikes in incoming data rates, and novel hybrid data storage organizations that would seamlessly and efficiently combine pull- and push-based data processing.
As most current query processing architectures are already pipelined, it seems logical to try to extend them from stored files to data streams. However, two classes of query operators are impractical for processing long or infinite data streams. Unbounded stateful operators (such as join) maintain state with no upper bound, and therefore may run out of memory. Blocking operators (such as sort) read the entire input before emitting an output, and therefore might never produce a result. We believe that a priori knowledge of a data stream can permit the use of such operators. This knowledge can be expressed in the form of punctuations.
This work is part of the Niagara project.
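As a rough illustration of the idea, the sketch below shows how a punctuation can unblock an aggregate and bound its state. It is not code from the Niagara project. The `Item` and `Punctuation` classes, the auction schema, and the `max_bid_per_auction` function are all hypothetical names chosen for this example, and it assumes a punctuation is simply a marker stating that no further items with a given key will arrive.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Iterator, Tuple, Union


@dataclass(frozen=True)
class Item:
    """A regular stream element: one bid on an auction (hypothetical schema)."""
    auction_id: int
    amount: int


@dataclass(frozen=True)
class Punctuation:
    """Assumed meaning: no further Item with this auction_id will arrive."""
    auction_id: int


def max_bid_per_auction(
    stream: Iterable[Union[Item, Punctuation]],
) -> Iterator[Tuple[int, int]]:
    """Group-by/max over a possibly infinite stream.

    Without punctuations this operator would block forever and its state
    would grow without bound. A punctuation lets it emit the final result
    for one group and discard that group's state.
    """
    best: Dict[int, int] = {}  # per-auction running maximum
    for element in stream:
        if isinstance(element, Punctuation):
            # The group is complete: emit its result and purge its state.
            if element.auction_id in best:
                yield element.auction_id, best.pop(element.auction_id)
        else:
            current = best.get(element.auction_id)
            if current is None or element.amount > current:
                best[element.auction_id] = element.amount


if __name__ == "__main__":
    demo = [
        Item(1, 10), Item(2, 5), Item(1, 30),
        Punctuation(1),          # auction 1 is closed
        Item(2, 7),
        Punctuation(2),          # auction 2 is closed
    ]
    for auction_id, amount in max_bid_per_auction(demo):
        print(auction_id, amount)
```

Because the operator is a generator, the result for auction 1 is produced as soon as its punctuation is read, before the rest of the stream has been consumed.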
In the STREAM project, we are reinvestigating data management and processing in the presence of multiple, continuous, rapid, time-varying data streams. We are attacking problems ranging from basic theoretical results to algorithms to the implementation of a comprehensive prototype data stream management system.
Telegraph is an adaptive dataflow system that allows individuals and institutions to access, combine, analyze, and otherwise benefit from data wherever it resides. As a dataflow system, Telegraph can tap into pooled data stored on the network, and harness streams of live data coming out of networked sensors, software, and smart devices. In order to operate robustly in this volatile, internetworked world, Telegraph is adaptive -- it uses new dataflow technologies to route unpredictable and bursty dataflows through computing resources on a network, resulting in manageable streams of useful information.
There was a panel session on "Processing Data Streams: Applications, Challenges and Approaches" at the 2002 ICDE conference. The panelists were David Maier, OHSU, panel chair; Michael Franklin, UC Berkeley; Johannes Gehrke, Cornell; Praveen Seshadri, Microsoft; and Jennifer Widom, Stanford. Here are some brief notes about the issues discussed by the panel.
The first "stream meeting" was held at Stanford on March 20, 2002. Representatives from Berkeley, OHSU, and Stanford attended. Here is a list of participants.
Notes from the meeting are available in Microsoft Word and PDF formats. They are also available in (ugly) HTML. (Thanks to Sam Madden for providing the notes.)
Slides from some of the presentations made at the stream meeting are available.
The 2002 SIGMOD and PODS conferences were full of data streams.
The second "stream meeting" was held at Berkeley on September 27, 2002. Representatives from Berkeley, Brandeis (Aurora), OHSU, and Stanford attended. Notes from the meeting are available in HTML. Slides from some of the presentations are available at the same place as the notes. (Thanks to Sailesh Krishnamurthy and Mehul Shah for providing the notes.)
The Stream Winter Meeting 2003 was held at Stanford on January 9, 2003. SWiM included a much broader group of participants than the two prior Stream Meetings. More information is available at the SWiM web site.
A workshop on Management and Processing of Data Streams was held in conjunction with SIGMOD 2003. More information is available from the workshop web site.
The fourth "stream meeting" was held at OHSU on August 7, 2003. Representatives from Berkeley, OHSU, and Stanford attended. More information is available from the meeting web page.
Last modified: Thu Oct 23 2003 by Brian Babcock.