Gabriel Marin and John Mellor-Crummey (2005)
Scalable Cross-Architecture Predictions of Memory Hierarchy Response for Scientific Applications
In: Proceedings of the Los Alamos Computer Science Institute Sixth Annual Symposium, Santa Fe, NM, Los Alamos National Laboratory.
The gap between processor and memory speeds has been growing with each new generation of microprocessors. As a result, memory hierarchy response has become a critical factor limiting application performance. For this reason, we have been working to model the impact of memory access latency on program performance. We build upon our prior work on constructing machine-independent characterizations of application behavior by improving instruction-level modeling of the structure and scaling of data access patterns. This paper makes two contributions. First, it describes static analysis techniques that help us build accurate reference-level characterizations of memory reuse patterns in the presence of complex interactions between loop unrolling and multi-word memory blocks. Second, it describes a strategy for combining memory hierarchy response characterizations suitable for predicting behavior for fully associative caches with a probabilistic technique that enables us to predict misses for set-associative caches. We validate our approach by comparing our predictions (at loop, routine and program level) against measurements using hardware performance counters for several benchmarks on two platforms with different memory hierarchy characteristics over a large range of problem sizes.