Report Number: CSL-TR-97-719
Institution: Stanford University, Computer Systems Laboratory
Title: Automatic Computation and Data Decomposition for
Multiprocessors
Author: Anderson, Jennifer-Ann Monique
Date: March 1997
Abstract: Memory subsystem efficiency is critical to achieving high
performance on parallel machines. The memory subsystem
organization of modern multiprocessor architectures makes
their performance highly sensitive to both the distribution
of the computation and the layout of the data. A key issue in
programming these machines is selecting the computation and
data decomposition, the mapping of the computation and data,
respectively, across the processors of the machine.
A popular approach to the decomposition problem is to require
programmers to perform the decomposition analysis themselves,
and to communicate that information to the compiler using
language extensions. This thesis presents a new compiler
algorithm that automatically calculates computation and data
decompositions for dense-matrix scientific codes. The core of
the algorithm is based on a linear algebra framework for
expressing and calculating decompositions. Since the best
decompositions may change as different phases of the program
are executed, the algorithm also considers re-organizing the
data dynamically. The analysis is performed both within and
across procedure boundaries so that entire programs can be
analyzed.
We evaluated the effectiveness of the algorithm by applying
it to a suite of benchmark programs. We found that our
decomposition analysis and optimization can lead to
significant increases in program performance.
http://i.stanford.edu/pub/cstr/reports/csl/tr/97/719/CSL-TR-97-719.pdf