BIB-VERSION:: CS-TR-v2.0
          ID:: STAN//CS-TR-90-1318
       ENTRY:: September 14, 1994
ORGANIZATION:: Stanford University, Department of Computer Science
       TITLE:: Techniques for improving the performance of sparse matrix
               factorization on multiprocessor workstations
        TYPE:: Technical Report
      AUTHOR:: Rothberg, Edward
      AUTHOR:: Gupta, Anoop
        DATE:: June 1990
       PAGES:: 14
    ABSTRACT:: In this paper we look at the problem of factoring large
               sparse systems of equations on high-performance
               multiprocessor workstations. While these multiprocessor
               workstations are capable of very high peak floating point
               computation rates, most existing sparse factorization codes
               achieve only a small fraction of this potential. A major
               limiting factor is the cost of memory accesses performed
               during the factorization. In this paper, we describe a
               parallel factorization code which utilizes the supernodal
               structure of the matrix to reduce the number of memory
               references. We also propose enhancements that significantly
               reduce the overall cache miss rate. The result is greatly
               increased factorization performance. We present experimental
               results from executions of our codes on the Silicon Graphics
               4D/380 multiprocessor. Using eight processors, we find that
               the supernodal parallel code achieves a computation rate of
               approximately 40 MFLOPS when factoring a range of benchmark
               matrices. This is more than twice as fast as the parallel
               nodal code developed at the Oak Ridge National Laboratory
               running on the SGI 4D/380.
       NOTES:: [Adminitrivia V1/RAM/19940914]
         END:: STAN//CS-TR-90-1318