Management of Space in Hierarchical Storage Systems
Shahram Ghandeharizadeh
Computer Science Department
University of Southern California
The past decade has witnessed a proliferation of repositories
whose workload consists of queries that retrieve information.
These repositories provide on-line access to vast amounts of
data and serve as an integral component of many applications,
e.g., library information systems, scientific applications,
and the entertainment industry. Their storage subsystems are
expected to be hierarchical, consisting of memory, magnetic
disk drives, optical disk drives, and tape libraries. The
database itself resides permanently on tape. Objects are
swapped onto either the magnetic or optical disk drives on
demand, and later deleted when the available space of a
device is exhausted. Over time, this cycle of swapping and
deletion generally fragments the disk space, resulting in a
non-contiguous layout of disk-resident objects. As a
consequence, the disk must reposition its read head multiple
times (incurring seek operations) whenever a resident object
is retrieved, which may reduce the overall performance of the
system.
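The fragmentation effect described above can be illustrated
with a toy model. The sketch below is not one of the
techniques covered in the presentation; it assumes a disk of
fixed-size blocks, a lowest-numbered-free-block allocator,
and first-in-first-out deletion of resident objects, all of
which are illustrative choices. The names Disk, materialize,
count_extents, and simulate are hypothetical. The number of
extents (contiguous runs of blocks) occupied by an object
approximates the number of head repositionings needed to
retrieve it.

```python
import random
from collections import OrderedDict


def count_extents(blocks):
    """Return the number of contiguous runs in a sorted list of block numbers."""
    if not blocks:
        return 0
    return 1 + sum(1 for a, b in zip(blocks, blocks[1:]) if b != a + 1)


class Disk:
    """Toy disk: fixed-size blocks, FIFO deletion (an assumed policy)."""

    def __init__(self, num_blocks):
        self.capacity = num_blocks
        self.free = set(range(num_blocks))
        # object id -> sorted list of its blocks, oldest object first
        self.resident = OrderedDict()

    def evict_oldest(self):
        """Delete the object that has been resident the longest."""
        _, blocks = self.resident.popitem(last=False)
        self.free.update(blocks)

    def materialize(self, obj_id, size):
        """Copy an object from tape to disk; return its extent count."""
        if size > self.capacity:
            raise ValueError("object is larger than the disk")
        # Delete resident objects only when free space is exhausted.
        while len(self.free) < size:
            self.evict_oldest()
        # Take the lowest-numbered free blocks, contiguous or not.
        blocks = sorted(self.free)[:size]
        self.free.difference_update(blocks)
        self.resident[obj_id] = blocks
        return count_extents(blocks)


def simulate(num_blocks=1000, num_objects=200, requests=5000, seed=0):
    """Return the average number of extents touched per retrieval."""
    rng = random.Random(seed)
    sizes = {i: rng.randint(5, 60) for i in range(num_objects)}
    disk = Disk(num_blocks)
    total = 0
    for _ in range(requests):
        obj = rng.randrange(num_objects)
        if obj not in disk.resident:
            disk.materialize(obj, sizes[obj])
        total += count_extents(disk.resident[obj])
    return total / requests
```

In a ten-block instance of this model, materializing objects
of sizes 4, 4, 4, 3, and 5 in that order leaves the last
object split across two extents, because the space freed by
earlier deletions is no longer contiguous. Calling simulate()
reports the average extent count over a longer random
workload; a value above 1 indicates that retrievals incur
extra seeks.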
This presentation describes four alternative techniques to
manage the available space of mechanical devices in such
hierarchical storage systems. Conceptually, these techniques
can be categorized according to how they optimize for several
factors, including: 1) the fragmentation of disk-resident
objects, 2) the amount of wasted space, and 3) the ability to
adapt to the evolving access pattern of an application. We
identify these factors and demonstrate their impact using a
simulation study.