BIB-VERSION:: CS-TR-v2.0 ID:: STAN//CS-TR-90-1305 ENTRY:: September 06, 1994 ORGANIZATION:: Stanford University, Department of Computer Science TITLE:: A comparative evaluation of nodal and supernodal parallel sparse matrix factorization: detailed simulation results TYPE:: Technical Report AUTHOR:: Rathberg, Edward AUTHOR:: Gupta, Anoop DATE:: February 1990 PAGES:: 29 ABSTRACT:: In this paper we consider the problem of factoring a large sparse system of equations on a modestly parallel shared-memory multiprocessor with a non-trivial memory hierarchy. Using detailed multiprocessor simulation, we study the behavior of the parallel sparse factorization scheme developed at the Oak Ridge National Laboratory. We then extend the Oak Ridge scheme to incorporate the notion of supernodal elimination. We present detailed analyses of the sources of performance degradation for each of these schemes. We measure the impact of interprocessor communication costs, processor load imbalance, overheads introduced in order to distribute work, and cache behavior on overall parallel performance. For the three benchmark matrices which we study, we find that the supernodal scheme gives a factor of 1.7 to 2.7 performance advantage for 8 processors and a factor of 0.9 to 1.6 for 32 processors. The supemodal scheme exhibits higher performance due mainly to the fact that it executes many fewer memory operations and produces fewer cache misses. However, the natural task grain size for the supernodal scheme is much larger than that of the Oak Ridge scheme, making effective distnbution of work more difficult, especially when the number of processors is large. NOTES:: [Adminitrivia V1/RAM/19940906] END:: STAN//CS-TR-90-1305