Report Number: CSL-TR-94-616
Institution: Stanford University, Computer Systems Laboratory
Title: Reuse of High Precision Arithmetic Hardware to Perform Multiple Low Precision Calculations
Author: Zucker, Daniel
Author: Lee, Ruby
Date: April 1994
Abstract: Many increasingly important applications, such as video compression, graphics, or multimedia, require only low-precision arithmetic. However, because the widespread adoption of the IEEE floating point standard has led to the ubiquity of IEEE double precision hardware, this double precision hardware is frequently used to do the low precision calculations. Naturally, it seems an inefficient use of resources to use 54 bits of hardware to perform an 8 or 12 bit calculation. This paper presents a method for packing operands to perform multiple low precision arithmetic operations using regular high precision hardware. Using only source level software modification, a speedup of 15% is illustrated for the Discrete Cosine Transform. Since no machine-specific optimizations are required, this method will work on any machine that supports IEEE arithmetic. Finally, an analysis of speedup and suggestions for future work are presented.
http://i.stanford.edu/pub/cstr/reports/csl/tr/94/616/CSL-TR-94-616.pdf
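Illustration: the operand-packing idea in the abstract can be sketched in a few lines. This is a minimal sketch under stated assumptions, not the report's implementation. Two small integers share one IEEE double, separated by a power-of-two offset, so a single multiply-accumulate on the packed value updates both sums at once. The names `SHIFT`, `pack`, `unpack`, and `packed_dot` and the 24-bit field spacing are hypothetical choices made for this example. It assumes integer coefficients and operands small enough that every intermediate result is exactly representable in a double and that the low-field result stays below half the field spacing in magnitude.

```python
import math

# Hypothetical field spacing: the low field occupies the range below 2**24.
SHIFT = 2.0 ** 24


def pack(lo, hi):
    """Pack two small integers into one double as hi * SHIFT + lo."""
    return float(hi) * SHIFT + float(lo)


def unpack(p):
    """Recover (lo, hi) from a packed double.

    Rounding to the nearest multiple of SHIFT allows the low field to be
    negative, as long as its magnitude is below SHIFT / 2.
    """
    hi = math.floor(p / SHIFT + 0.5)
    lo = p - hi * SHIFT
    return int(lo), int(hi)


def packed_dot(coeffs, xs_lo, xs_hi):
    """Compute two dot products with one multiply-add per coefficient.

    Each step multiplies a shared coefficient by a packed operand pair, so
    the accumulator holds sum(c * lo) in the low field and sum(c * hi) in
    the high field.
    """
    acc = 0.0
    for c, a, b in zip(coeffs, xs_lo, xs_hi):
        acc += c * pack(a, b)
    return unpack(acc)
```

A transform such as the Discrete Cosine Transform applies the same coefficients to many inputs, which is why sharing one coefficient across both packed fields fits this kind of workload. Real coefficients are not integers, so a practical version would also have to bound rounding error between the fields; this sketch leaves that out.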