Report Number: CSL-TR-00-807
Institution: Stanford University, Computer Systems Laboratory
Author: Liao, Shih-Wei
Date: August 2000
Abstract: Shared-memory multiprocessors that use the latest microprocessors
are becoming widely used both as compute servers and as desktop computers. But
the difficulty in developing parallel software is a major obstacle to the
effective use of the multiprocessors to solve a single task. To increase the productivity of
multiprocessor programmers, we developed an interactive interprocedural
parallelizer called SUIF Explorer. Our experience with SUIF Explorer also helps
to identify missing interprocedural analyses that can significantly improve an
automatic parallelizer. As a parallel programming tool, the Explorer actively
guides the programmers in the parallelization process using a set of advanced
static and dynamic analyses and
visualization techniques. Our interprocedural program analyses provide high-
quality information that restricts the need for user assistance. The Explorer is also
the first tool to apply slicing analysis to aid the programmer in uncovering program
properties for interactive parallelization. These static and dynamic analyses
minimize the number of lines of code requiring programmer assistance to produce
parallel codes for real-world applications.
As a tool for finding missing compiler techniques, SUIF Explorer helps the
compiler researchers design the next-generation parallelizer. Our experience
with the Explorer shows that interprocedural array liveness analysis is an enabler of
several important optimizations, such as privatization and array contraction.
We developed and evaluated an efficient context-sensitive and flow-sensitive
interprocedural array liveness algorithm and integrated it into the parallelizer. We use the
liveness information to enable contraction of arrays that are not live at loop
exits, which results in a smaller memory footprint and better cache utilization. The
resulting codes run faster on both uni- and multi- processors.
Another key interprocdural analysis which we developed and evaluated is the
array reduction analysis. Our reduction algorithm extends beyond previous
approaches in its ability to locate reductions to array regions, even in the
presence of arbitrarily complex data dependences. To exploit the multiprocessors
effectively, the algorithm can locate interprocedural reductions, reduction operations that
span multiple procedures. In summary, we successfully apply the Explorer to help the
user develop parallel codes effectively and to help the compiler researcher
develop the next-generation parallelizer.
http://i.stanford.edu/pub/cstr/reports/csl/tr/00/807/CSL-TR-00-807.pdf