Report Number: CS-TR-89-1261
Institution: Stanford University, Department of Computer Science
Title: Pipelined parallel computations, and sorting on a pipelined
hypercube.
Author: Mayr, Ernst W.
Author: Plaxton, C. Greg
Date: May 1989
Abstract: This paper brings together a number of previously known
techniques in order to obtain practical and efficient
implementations of the prefix operation for the complete
binary tree, hypercube and shuffle exchange families of
networks. For each of these networks, we also provide a
"pipelined" scheme for performing k prefix operations in O(k
+ log p) time on p processors. This implies a similar
pipelining result for the "data distribution" operation of
Ullman [16]. The data distribution primitive leads to a
simplified implementation of the optimal merging algorithm of
Varman and Doshi, which runs on a pipelined model of the
hypercube [17]. Finally, a pipelined version of the multi-way
merge sort of Nassimi and Sahni [10], running on the
pipelined hypercube model, is described. Given p processors
and n < p log p values to be sorted, the running time of the
pipelined algorithm is O(log2 p/log((p log p)/n)). Note that
for the interesting case n = p this yields a running time of
0(log2 p/log log p), which is asymptotically faster than
Batcher's bitonic sort[3].
http://i.stanford.edu/pub/cstr/reports/cs/tr/89/1261/CS-TR-89-1261.pdf