... | ... | @@ -20,6 +20,10 @@ The exact heuristics are not published by Intel. |
|
|
<img width="49%" src="https://raw.githubusercontent.com/wiki/RRZE-HPC/likwid/images/skx_caches/cache_layers_skx.png" alt="Cache layers of Intel Skylake SP processors">
|
|
|
</p>
|
|
|
|
|
|
<p align="center">
|
|
|
<img width="49%" src="https://raw.githubusercontent.com/wiki/RRZE-HPC/likwid/images/skx_caches/overview.png" alt="Cache events on data-paths for Skylake SP processors">
|
|
|
</p>
|
|
|
|
|
|
(**) Unless the LLC prefetcher is active and pulls some cache lines from memory, which is the case on the Intel Skylake SP test system. As we will see later, the currently known events cannot distinguish whether the data loaded into L2 comes from L3 or from memory. The prefetcher accelerates the loading of data for streaming access, so we probably measure a higher load bandwidth with it enabled. The analysis, however, is based on data volume per iteration, which leaves out the time factor.
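
To illustrate why a volume-per-iteration metric does not depend on runtime, here is a minimal Python sketch. The function names and all numbers are hypothetical and chosen only for illustration; they are not measurements from the test system. The only hardware fact assumed is the 64-byte cache line size of Intel Skylake SP.

```python
# Illustrative sketch: data volume per iteration vs. bandwidth.
# All counts and runtimes below are made-up example values.

CACHE_LINE_BYTES = 64  # cache line size on Intel Skylake SP


def bytes_per_iteration(cache_lines: int, iterations: int) -> float:
    """Data volume per loop iteration derived from a cache-line count."""
    if iterations <= 0:
        raise ValueError("iterations must be positive")
    return cache_lines * CACHE_LINE_BYTES / iterations


def bandwidth_bytes_per_s(cache_lines: int, runtime_s: float) -> float:
    """Bandwidth derived from the same count; this one depends on time."""
    if runtime_s <= 0:
        raise ValueError("runtime_s must be positive")
    return cache_lines * CACHE_LINE_BYTES / runtime_s


# Hypothetical streaming kernel loading one double (8 bytes) per iteration.
iterations = 1_000_000
loaded_lines = 125_000  # hypothetical event count of loaded cache lines

# The volume per iteration is the same regardless of how fast the data arrived.
volume = bytes_per_iteration(loaded_lines, iterations)  # 8.0 bytes

# A faster run (e.g. with an active prefetcher) only changes the bandwidth.
bw_fast = bandwidth_bytes_per_s(loaded_lines, runtime_s=0.25)
bw_slow = bandwidth_bytes_per_s(loaded_lines, runtime_s=0.5)

print(f"volume per iteration: {volume} bytes")
print(f"bandwidth (fast run): {bw_fast:.0f} B/s")
print(f"bandwidth (slow run): {bw_slow:.0f} B/s")
```

With identical event counts, the two runs report different bandwidths but the same volume per iteration, which is why the prefetcher's speed-up does not distort a volume-based analysis.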
|
|
|
|
|
|
## What is the difference for measurements?
|
... | ... | |