Report Number: CSL-TR-90-413
Institution: Stanford University, Computer Systems Laboratory
Title: An area-utility model for on-chip memories and its application
Author: Mulder, Johannes M.
Author: Quach, Nhon T.
Author: Flynn, Michael J.
Date: February 1990
Abstract: Utility can be defined as quality per unit of cost. The
utility of a particular function in a microprocessor can be
defined as its contribution to the overall processor
performance per unit of implementation cost. In the case of
on-chip data memory (e.g., registers, caches) the performance
contribution can be reduced to its effectiveness in reducing
memory traffic or in reducing the average time to fetch
operands. An important cost measure for on-chip memory is
occupied area. On-chip memory performance, however, is
expressed much more easily as a function of size (the storage
capacity) than as a function of area.
Simple models have been proposed for mapping memory size to
occupied area. These models, however, are of unproven
validity and only apply when comparing relatively large
buffers (≥ 128 words for caches, ≥ 32 words for register
sets) of the same structure (e.g., cache versus cache). In
this paper we present an area model for on-chip memories. The
area model considers the supplied bandwidth of the individual
memory cells and includes such overhead as control logic,
driver logic, and tag storage, thereby permitting comparison
of data buffers of different organizations and of arbitrary
sizes. The model gave less than 10% error when verified
against real caches and register files.
Using this area-utility measure F(Performance,Area), we
first investigated the performance of various cache
organizations and then compared the performance of register
buffers (e.g., register sets, multiple overlapping sets) and
on-chip caches. Comparing cache performance as a function of
area, rather than size, leads to a significantly different
set of organizational tradeoffs. Caches occupy more area per
bit than register buffers for sizes of 128 words or less. For
data caches, line size is a primary determinant of
performance for small sizes while write policy becomes the
primary factor for larger caches. For the same area, multiple
register sets have poorer performance than a single register
set with cache, except when the memory access time is very
short (under 3 processor cycles).
http://i.stanford.edu/pub/cstr/reports/csl/tr/90/413/CSL-TR-90-413.pdf