Report Number: CSL-TR-92-523
Institution: Stanford University, Computer Systems Laboratory
Title: Architectural and implementation tradeoffs in the design of multiple-context processors
Author: Laudon, James
Author: Gupta, Anoop
Author: Horowitz, Mark
Date: May 1992
Abstract: Multiple-context processors have been proposed as an architectural technique to mitigate the effects of large memory latency in multiprocessors. We examine two schemes for implementing multiple-context processors. The first scheme switches between contexts only on a cache miss, while the other interleaves the contexts on a cycle-by-cycle basis. Both schemes provide the capability for a single context to fully utilize the pipeline. We show that cycle-by-cycle interleaving of contexts provides a performance advantage over switching contexts only at a cache miss. This advantage results from the context interleaving hiding pipeline dependencies and reducing the context switch cost. In addition, we show that while the implementation of the interleaved scheme is more complex, the complexity is not overwhelming. As pipelines get deeper and operate at lower percentages of peak performance, the performance advantage of the interleaved scheme is likely to justify its additional complexity.
http://i.stanford.edu/pub/cstr/reports/csl/tr/92/523/CSL-TR-92-523.pdf