Report Number: CSL-TR-92-523
Institution: Stanford University, Computer Systems Laboratory
Title: Architectural and implementation tradeoffs in the design of multiple-context processors
Author: Laudon, James
Author: Gupta, Anoop
Author: Horowitz, Mark
Date: May 1992
Abstract: Multiple-context processors have been proposed as an
architectural technique to mitigate the effects of large
memory latency in multiprocessors. We examine two schemes for
implementing multiple-context processors. The first scheme
switches between contexts only on a cache miss, while the
other interleaves the contexts on a cycle-by-cycle basis.
Both schemes provide the capability for a single context to
fully utilize the pipeline. We show that cycle-by-cycle
interleaving of contexts provides a performance advantage
over switching contexts only at a cache miss. This advantage
results from the context interleaving hiding pipeline
dependencies and reducing the context switch cost. In
addition, we show that while the implementation of the
interleaved scheme is more complex, the complexity is not
overwhelming. As pipelines get deeper and operate at lower
percentages of peak performance, the performance advantage of
the interleaved scheme is likely to justify its additional
complexity.
http://i.stanford.edu/pub/cstr/reports/csl/tr/92/523/CSL-TR-92-523.pdf
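
Illustrative note: the abstract attributes part of the interleaved scheme's advantage to a reduced context switch cost. The sketch below is a minimal, simplified analytic model of that one effect. It is not taken from the report. The function name `utilization`, its parameters, and all numeric values are hypothetical and chosen only for illustration. It assumes every context runs a fixed number of cycles between cache misses and that each switch wastes a fixed number of cycles. It does not model the hiding of pipeline dependencies.

```python
def utilization(contexts, run_length, latency, switch_cost):
    """Fraction of cycles spent on useful work in a simplified model.

    Assumptions (hypothetical, not from the report): each context runs
    `run_length` cycles between cache misses, each miss takes `latency`
    cycles to service, and each context switch wastes `switch_cost` cycles.
    """
    # Linear region: too few contexts to cover the miss latency,
    # so the processor idles part of the time.
    linear = contexts * run_length / (run_length + switch_cost + latency)
    # Saturated region: latency is fully hidden, and only the
    # switch overhead keeps utilization below 1.
    saturated = run_length / (run_length + switch_cost)
    return min(linear, saturated)


if __name__ == "__main__":
    # Illustrative values only: a larger switch cost stands in for
    # switching on a cache miss, a smaller one for interleaving.
    for n in (1, 2, 4, 8):
        high_cost = utilization(n, run_length=20, latency=100, switch_cost=7)
        low_cost = utilization(n, run_length=20, latency=100, switch_cost=1)
        print(f"contexts={n}  high-cost={high_cost:.3f}  low-cost={low_cost:.3f}")
```

Under these assumptions, the lower switch cost gives higher utilization at every context count, and the gap is widest once enough contexts are available to hide the latency completely.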