Paper: Wulf, Wm. A. & McKee, Sally A. (1995). Hitting the Memory Wall: Implications of the Obvious. ACM SIGARCH Computer Architecture News, 23(1).
This 1995 paper describes a then-impending era in which an average cache miss would take longer to resolve than the execution of all the instructions between it and the next miss. At that point the processor would essentially sit idle, unable to proceed until the earlier miss resolved, weighed down by slow memory.
Important assumptions:
- Every 5th instruction accesses memory (a fair assumption, though the exact ratio doesn't matter for the argument)
- 1% of those accesses miss in the cache and trigger a refill from DRAM (this matters, because a perfect cache would never stall the processor this way)
- Processor speeds improve by about 100% per year while memory (DRAM) speeds improve by only about 7% per year (again, the exact rates don't matter as long as there is a growing mismatch)
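The assumptions above can be sketched with the paper's average-access-time model, t_avg = p × t_c + (1 − p) × t_m, where p is the cache hit rate, t_c the cache access cost, and t_m the miss cost, all in CPU cycles. The starting miss penalty of 50 cycles below is my own illustrative number, not a figure from the paper; only the growth rates (100%/yr vs. 7%/yr) come from the assumptions listed.

```python
def avg_access_cycles(hit_rate, cache_cycles, dram_cycles):
    """Average cycles per memory access: p*t_c + (1-p)*t_m."""
    return hit_rate * cache_cycles + (1 - hit_rate) * dram_cycles

# Assumed starting point (illustrative): a hit costs 1 cycle, a miss 50.
# CPUs get ~100%/yr faster while DRAM gets ~7%/yr faster, so the miss
# penalty *measured in CPU cycles* grows by a factor of 2.0/1.07 per year.
base_miss_penalty = 50.0
for year in range(0, 11, 2):
    penalty = base_miss_penalty * (2.0 / 1.07) ** year
    t_avg = avg_access_cycles(0.99, 1, penalty)
    print(f"year {year:2d}: miss penalty ~{penalty:8.0f} cycles, "
          f"t_avg ~{t_avg:7.1f} cycles/access")
```

Even with a 99% hit rate, t_avg is soon dominated by the (1 − p) × t_m term: the miss penalty roughly doubles each year relative to the CPU, which is the whole point of the "wall".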
The wall follows directly from the third assumption: processors (then, as now) get faster at a much higher rate than memories (DRAMs). In hindsight, processors did indeed become so much faster that memory turned into a significant bottleneck. The ramifications are plainly visible today in the scarcity of memory bandwidth across multiple cores and heterogeneous compute elements, and in the major advances it motivated, such as processing-in-memory and NUMA. One could even trace the motivations of many other techniques, like cache prefetching and speculative execution, back to this aptly named "Memory Wall".
If you would like to suggest corrections or ideas, please feel free to email me; details in the Contacts section.