Random sampling is an appealing approach to build synopses of large data sets
because random samples can be used for a broad spectrum of analytical tasks.
Current research mainly treats the database as static; in this setting,
a sample created once remains valid for its entire lifetime. In many
applications, however, such a static view is infeasible because it does not
take into account the dynamic nature of the underlying data. In this talk,
I will briefly summarize recent research on the problem of maintaining a
random sample of an evolving dataset. As an example of the challenges of
sample maintenance and the techniques required to address them, I will discuss
the problem of maintaining a random sample from a sliding window over a data
stream, where the window covers a recent time interval. In this setting, the
main challenge is
to guarantee an upper bound on the sample's space consumption while still
using the allotted space efficiently. The difficulty arises from
the fact that the number of items in the window is unknown in advance and may
vary significantly over time, so that the sampling fraction has to be adjusted
dynamically. In the talk, I will outline a novel sampling scheme called
bounded priority sampling (BPS), which requires only bounded space and quickly
adapts to changing data.
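The abstract does not spell out the BPS algorithm itself. As a rough illustration of the underlying priority-sampling idea only, the sketch below maintains a uniform random sample of size one from a sliding window of the last `w` items: each arriving item receives an i.i.d. random priority, and the sample is always the in-window item with the highest priority. The deque of "right-to-left maxima" used here needs O(log w) space in expectation, not the hard space bound that BPS guarantees; all function and variable names are hypothetical.

```python
import random
from collections import deque

def sliding_window_priority_sample(items, w):
    """Yield, after each arrival, a uniform random sample of size 1
    from the window of the last w items.

    Each item gets a random priority in (0, 1); the sampled item is
    the max-priority item in the window, which is uniform over the
    window because priorities are i.i.d. and continuous.
    """
    # Candidates (index, priority, item) kept with strictly
    # decreasing priorities; the front is the in-window maximum.
    candidates = deque()
    samples = []
    for i, item in enumerate(items):
        p = random.random()
        # Items with lower priority that arrived earlier can never
        # again be the window maximum, so discard them.
        while candidates and candidates[-1][1] < p:
            candidates.pop()
        candidates.append((i, p, item))
        # Expire the front candidate once it leaves the window.
        while candidates[0][0] <= i - w:
            candidates.popleft()
        samples.append(candidates[0][2])
    return samples
```

A fixed-rate Bernoulli sample cannot offer this guarantee: if the arrival rate surges, the number of sampled in-window items grows without bound, which is exactly why the sampling fraction must adapt dynamically.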