Report Number: CS-TR-83-994
Institution: Stanford University, Department of Computer Science
Title: Sorting by recursive partitioning
Author: Chapiro, Daniel M.
Date: December 1983
Abstract: We present a new O(n lg lg n) time sorting algorithm that is
more robust than O(n) distribution sorting algorithms. The
algorithm uses a recursive partition-concatenate approach,
partitioning each sequence into a variable number of subsets
using statistical information computed dynamically for that
sequence during the sort.
Space complexity is O(n), independent of the order and
distribution of the data. If the data is originally in a
list, only O($\sqrt{n}$) extra space is necessary.
The algorithm is insensitive to the initial ordering of the
data, and it is much less sensitive to the distribution of
the values of the sorting keys than distribution sorting
algorithms. Its worst-case time is O(n lg lg n) across all
distributions that satisfy a new "fractalness" criterion.
This condition, which is sufficient but not necessary, is
satisfied by any set with bounded length keys and bounded
repetition of each key.
If this condition is not satisfied, the worst-case
performance degrades gracefully to O(n lg n). In practice,
this occurs when the density of the distribution over
$\Omega$(n) of the keys is a fractal curve (for sets of
numbers whose values are bounded), or when the distribution
has very heavy tails with arbitrarily long keys (for sets of
numbers whose precision is bounded).
In some preliminary tests, it was faster than Quicksort for
sets of more than 150 elements. The algorithm is practical,
works basically "in place", can be easily implemented and is
particularly well suited both for parallel processing and for
external sorting.
http://i.stanford.edu/pub/cstr/reports/cs/tr/83/994/CS-TR-83-994.pdf