... | ... | @@ -9,9 +9,11 @@ |
|
|
The input file for the events on Intel® Haswell EP/EN/EX can be found [here](https://github.com/rrze-likwid/likwid/blob/master/src/includes/perfmon_haswellEP_events.txt).
|
|
|
|
|
|
## Counters
|
|
|
- [Core-local counters](https://github.com/rrze-likwid/likwid/wiki/Haswell-EP#core-local-counters)
|
|
|
- [Fixed-purpose counters](https://github.com/rrze-likwid/likwid/wiki/Haswell-EP#fixed-purpose-counters)
|
|
|
- [General-purpose counters](https://github.com/rrze-likwid/likwid/wiki/Haswell-EP#general-purpose-counters)
|
|
|
- [Thermal counter](https://github.com/rrze-likwid/likwid/wiki/Haswell-EP#thermal-counter)
|
|
|
- [Socket-wide counters](https://github.com/rrze-likwid/likwid/wiki/Haswell-EP#socket-wide-counters)
|
|
|
- [Power counters](https://github.com/rrze-likwid/likwid/wiki/Haswell-EP#power-counters)
|
|
|
- [Home Agent counters](https://github.com/rrze-likwid/likwid/wiki/Haswell-EP#home-agent-counters)
|
|
|
- [LLC-to-QPI interface fixed-purpose counters](https://github.com/rrze-likwid/likwid/wiki/Haswell-EP#llc-to-qpi-interface-fixed-purpose-counters)
|
... | ... | @@ -27,9 +29,11 @@ The input file for the events on Intel® Haswell EP/EN/EX can be found [here] |
|
|
- [Ring-to-PCIe counters](https://github.com/rrze-likwid/likwid/wiki/Haswell-EP#ring-to-pcie-counters)
|
|
|
- [IRP box counters](https://github.com/rrze-likwid/likwid/wiki/Haswell-EP#irp-box-counters)
|
|
|
|
|
|
### Fixed-purpose counters
|
|
|
Since the Core2 microarchitecture, Intel® provides a set of fixed-purpose counters. Each can measure only one specific event. They are core-local, hence each hardware thread has its own set of fixed counters.
|
|
|
#### Counters
|
|
|
|
|
|
### Core-local counters
|
|
|
#### Fixed-purpose counters
|
|
|
Since the Core2 microarchitecture, Intel® provides a set of fixed-purpose counters. Each can measure only one specific event.
|
|
|
##### Counters
|
|
|
<TABLE>
|
|
|
<TR>
|
|
|
<TH>Counter name</TH>
|
... | ... | @@ -49,7 +53,7 @@ Since the Core2 microarchitecture, Intel® provides a set of fixed-purpose co |
|
|
</TR>
|
|
|
</TABLE>
|
|
|
|
|
|
#### Available Options
|
|
|
##### Available Options
|
|
|
<TABLE>
|
|
|
<TR>
|
|
|
<TH>Option</TH>
|
... | ... | @@ -71,9 +75,9 @@ Since the Core2 microarchitecture, Intel® provides a set of fixed-purpose co |
|
|
</TR>
|
|
|
</TABLE>
|
|
|
|
|
|
### General-purpose counters
|
|
|
The Intel® Haswell EP/EN/EX microarchitecture provides 4 general-purpose counters, each consisting of a config and a counter register. They are core-local.
|
|
|
#### Counters
|
|
|
#### General-purpose counters
|
|
|
The Intel® Haswell EP/EN/EX microarchitecture provides 4 general-purpose counters, each consisting of a config and a counter register.
|
|
|
##### Counters
|
|
|
<TABLE>
|
|
|
<TR>
|
|
|
<TH>Counter name</TH>
|
... | ... | @@ -97,7 +101,7 @@ The Intel® Haswell EP/EN/EX microarchitecture provides 4 general-purpose cou |
|
|
</TR>
|
|
|
</TABLE>
|
|
|
|
|
|
#### Available Options
|
|
|
##### Available Options
|
|
|
<TABLE>
|
|
|
<TR>
|
|
|
<TH>Option</TH>
|
... | ... | @@ -149,7 +153,7 @@ The Intel® Haswell EP/EN/EX microarchitecture provides 4 general-purpose cou |
|
|
</TR>
|
|
|
</TABLE>
|
|
|
|
|
|
#### Special handling for events
|
|
|
##### Special handling for events
|
|
|
The Intel® Haswell EP/EN/EX microarchitecture supports measuring offcore events with the PMC counters. For this, the stream of offcore events must be filtered using the OFFCORE_RESPONSE registers. The Intel® Haswell microarchitecture has two of those registers. LIKWID defines some events that perform the filtering according to the event name. Although many bitmasks are possible, LIKWID natively provides only the ones with response type ANY. Custom filtering can be applied with the OFFCORE_RESPONSE_0_OPTIONS and OFFCORE_RESPONSE_1_OPTIONS events. Two additional counter options are available for these events only:
|
|
|
<TABLE>
|
|
|
<TR>
|
... | ... | @@ -162,20 +166,20 @@ The Intel® Haswell EP/EN/EX microarchitecture provides measureing of offcore |
|
|
<TD>match0</TD>
|
|
|
<TD>16 bit hex value</TD>
|
|
|
<TD>Input value masked with 0x8FFF and written to bits 0-15 in the OFFCORE_RESPONSE register</TD>
|
|
|
<TD>Check the <A HREF="http://www.Intel.com/content/www/us/en/processors/architectures-software-developer-manuals.html">Intel® Software Developer System Programming Manual, Vol. 3, Chapter Performance Monitoring</A> and the event files at <A HREF="https://download.01.org/perfmon/HSX">https://download.01.org/perfmon/HSX</A>.</TD>
|
|
|
<TD>Check the <A HREF="http://www.Intel.com/content/www/us/en/processors/architectures-software-developer-manuals.html">Intel® Software Developer System Programming Manual, Vol. 3, Chapter Performance Monitoring</A> and <A HREF="https://download.01.org/perfmon/HSX">https://download.01.org/perfmon/HSX</A>.</TD>
|
|
|
</TR>
|
|
|
<TR>
|
|
|
<TD>match1</TD>
|
|
|
<TD>22 bit hex value</TD>
|
|
|
<TD>Input value is written to bits 16-37 in the OFFCORE_RESPONSE register</TD>
|
|
|
<TD>Check the <A HREF="http://www.Intel.com/content/www/us/en/processors/architectures-software-developer-manuals.html">Intel® Software Developer System Programming Manual, Vol. 3, Chapter Performance Monitoring</A> and the event files at <A HREF="https://download.01.org/perfmon/HSX">https://download.01.org/perfmon/HSX</A>.</TD>
|
|
|
<TD>Check the <A HREF="http://www.Intel.com/content/www/us/en/processors/architectures-software-developer-manuals.html">Intel® Software Developer System Programming Manual, Vol. 3, Chapter Performance Monitoring</A> and <A HREF="https://download.01.org/perfmon/HSX">https://download.01.org/perfmon/HSX</A>.</TD>
|
|
|
</TR>
|
|
|
</TABLE>
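
The placement of `match0` and `match1` described in the table can be illustrated with a small sketch. It is not LIKWID's implementation: the function name `offcore_response_value` is hypothetical, and truncating `match1` to 22 bits is an assumption derived from the "22 bit hex value" column.

```python
MATCH0_MASK = 0x8FFF   # mask given in the table for match0
MATCH1_BITS = 22       # match1 is documented as a 22 bit value
MATCH1_SHIFT = 16      # match1 is written to bits 16-37


def offcore_response_value(match0: int, match1: int) -> int:
    """Combine match0 and match1 into one OFFCORE_RESPONSE register value.

    Hypothetical helper for illustration only.
    """
    low = match0 & MATCH0_MASK                           # bits 0-15
    # Assumption: inputs wider than 22 bits are truncated.
    high = (match1 & ((1 << MATCH1_BITS) - 1)) << MATCH1_SHIFT  # bits 16-37
    return low | high


if __name__ == "__main__":
    # match0=0xFFFF is reduced by the 0x8FFF mask before it is written.
    print(hex(offcore_response_value(0xFFFF, 0x1)))
```

Note that bits 12-14 of `match0` are always cleared by the 0x8FFF mask, so a value such as 0x7000 has no effect.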
|
|
|
The event MEM_TRANS_RETIRED_LOAD_LAT is not available because it needs programming of PEBS registers. PEBS is a kernel-level measurement facility. Although we can program it from user-space, the results are always 0.
|
|
|
The event MEM_TRANS_RETIRED_LOAD_LATENCY is not available because it needs programming of PEBS registers. PEBS is a kernel-level measurement facility. Although we can program it from user-space, the results are always 0.
|
|
|
|
|
|
### Thermal counter
|
|
|
#### Thermal counter
|
|
|
The Intel® Haswell EP/EN/EX microarchitecture provides one register for the current core temperature.
|
|
|
#### Counters
|
|
|
##### Counters
|
|
|
<TABLE>
|
|
|
<TR>
|
|
|
<TH>Counter name</TH>
|
... | ... | @@ -187,9 +191,11 @@ The Intel® Haswell EP/EN/EX microarchitecture provides one register for the |
|
|
</TR>
|
|
|
</TABLE>
|
|
|
|
|
|
### Power counters
|
|
|
The Intel® Haswell EP/EN/EX microarchitecture provides measurements of the current power consumption through the RAPL interface. The RAPL counters are available for one hardware thread per CPU socket.
|
|
|
#### Counters
|
|
|
### Socket-wide counters
|
|
|
|
|
|
#### Power counters
|
|
|
The Intel® Haswell EP/EN/EX microarchitecture provides measurements of the current power consumption through the RAPL interface.
|
|
|
##### Counters
|
|
|
<TABLE>
|
|
|
<TR>
|
|
|
<TH>Counter name</TH>
|
... | ... | @@ -213,11 +219,12 @@ The Intel® Haswell EP/EN/EX microarchitecture provides measurements of the c |
|
|
</TR>
|
|
|
</TABLE>
|
|
|
|
|
|
### Home Agent counters
|
|
|
The Intel® Haswell EP/EN/EX microarchitecture provides measurements of the Home Agent (HA) in the Uncore. The description from Intel®:<BR>
|
|
|
<I>The HA is responsible for the protocol side of memory interactions, including coherent and non-coherent home agent protocols (as defined in the Intel® QuickPath Interconnect Specification). Additionally, the HA is responsible for ordering memory reads/writes, coming in from the modular Ring, to a given address such that the iMC (memory controller).</I><BR>
|
|
|
The HA hardware performance counters are exposed to the operating system through PCI interfaces. There are two of those interfaces for the HA. For systems where each socket has 12 or more cores, there are both HAs available. The name BBOX originates from the Nehalem EX Uncore monitoring where this functional unit is called BBOX.
|
|
|
#### Counters
|
|
|
#### Home Agent counters
|
|
|
The Intel® Haswell EP/EN/EX microarchitecture provides measurements of the Home Agent (HA) in the uncore. The description from Intel®:<BR>
|
|
|
<I>Each HA is responsible for the protocol side of memory interactions, including coherent and non-coherent home agent protocols (as defined in the Intel® QuickPath Interconnect Specification). Additionally, the HA is responsible for ordering memory reads/writes, coming in from the modular Ring, to a given address such that the IMC (memory controller).
|
|
|
</I><BR>
|
|
|
The Home Agent performance counters are exposed to the operating system through PCI interfaces. There are two of those interfaces for the HA. For systems where each socket has 12 or more cores, there are both HAs available. The name BBOX originates from the Nehalem EX uncore monitoring.
|
|
|
##### Counters
|
|
|
<TABLE>
|
|
|
<TR>
|
|
|
<TH>Counter name</TH>
|
... | ... | @@ -241,7 +248,7 @@ The HA hardware performance counters are exposed to the operating system through |
|
|
</TR>
|
|
|
</TABLE>
|
|
|
|
|
|
#### Available Options
|
|
|
##### Available Options
|
|
|
<TABLE>
|
|
|
<TR>
|
|
|
<TH>Option</TH>
|
... | ... | @@ -281,13 +288,13 @@ The HA hardware performance counters are exposed to the operating system through |
|
|
</TR>
|
|
|
</TABLE>
|
|
|
|
|
|
### Ring-to-Ring interface counters
|
|
|
#### Ring-to-Ring interface counters
|
|
|
The Intel® Haswell EP/EN/EX microarchitecture manages the socket-internal traffic through ring-based networks. Depending on the system's configuration, there are multiple rings in one socket. The SBOXes organize the traffic between the rings. The description from Intel®:<BR>
|
|
|
<I>The SBox manages the interface between the two Rings.<BR>
|
|
|
The processor is composed of two independent rings connected via two sets of bi-directional buffered switches. Each set of bi-directional buffered switches is partitioned into two ingress/egress pairs. Further, each ingress/egress pair is associated with a ring stop on adjacent rings. This ring stop is termed an Sbo. The processor has up to 4 SBos depending on SKU. The Sbo can be simply thought of as a conduit for the ring, but must also help maintain ordering of traffic to ensure functional correctness in certain cases.
|
|
|
</I><BR>
|
|
|
The SBOX hardware performance counters are exposed to the operating system through the MSR interface. There are at most four of those interfaces, but not all of them need to be present. The name SBOX originates from the Nehalem EX Uncore monitoring, where the functional unit connecting to the QPI network is called SBOX, although it had a different duty there.
|
|
|
#### Counters
|
|
|
The SBOX hardware performance counters are exposed to the operating system through the MSR interface. There are at most four of those interfaces, but not all of them need to be present. The name SBOX originates from the Nehalem EX uncore monitoring, where the functional unit connecting to the QPI network is called SBOX, although it had a different duty there.
|
|
|
##### Counters
|
|
|
<TABLE>
|
|
|
<TR>
|
|
|
<TH>Counter name</TH>
|
... | ... | @@ -311,7 +318,7 @@ The SBOX hardware performance counters are exposed to the operating system throu |
|
|
</TR>
|
|
|
</TABLE>
|
|
|
|
|
|
#### Available Options
|
|
|
##### Available Options
|
|
|
<TABLE>
|
|
|
<TR>
|
|
|
<TH>Option</TH>
|
... | ... | @@ -345,12 +352,12 @@ The SBOX hardware performance counters are exposed to the operating system throu |
|
|
</TR>
|
|
|
</TABLE>
|
|
|
|
|
|
### QPI interface fixed-purpose counters
|
|
|
The Intel® Haswell EP/EN/EX microarchitecture provides measurements of the QPI Link layer (QPI) in the Uncore. The description from Intel®:<BR>
|
|
|
<I>The Intel® QPI Link Layer is responsible for packetizing requests from the caching agent on the way out to the system interface. As such, it shares responsibility with the CBo(s) as the Intel® QPI caching agent(s). It is responsible for converting CBo requests to Intel® QPI messages (i.e. snoop generation and data response messages from the snoop response) as well as converting/forwarding ring messages to Intel® QPI packets and vice versa. On Ivy Bridge, Intel® QPI is split into two separate layers. The Intel® QPI LL (link layer) is responsible for generating, transmitting, and receiving packets with the Intel® QPI link.
|
|
|
#### QPI interface fixed-purpose counters
|
|
|
The Intel® Haswell EP/EN/EX microarchitecture provides measurements of the QPI Link layer (QPI) in the uncore. The description from Intel®:<BR>
|
|
|
<I>The Intel® QPI Link Layer is responsible for packetizing requests from the caching agent on the way out to the system interface. As such, it shares responsibility with the CBo(s) as the Intel® QPI caching agent(s). It is responsible for converting CBo requests to Intel® QPI messages (i.e. snoop generation and data response messages from the snoop response) as well as converting/forwarding ring messages to Intel® QPI packets and vice versa. On Intel® Xeon processor E5 v3 family, Intel® QPI is split into two separate layers. The Intel® QPI LL (link layer) is responsible for generating, transmitting, and receiving packets with the Intel® QPI link.
|
|
|
</I><BR>
|
|
|
The QPI hardware performance counters are exposed to the operating system through PCI interfaces. There are two of those interfaces for the QPI. The actual number of QBOX counters depends on the CPU core count of one socket. If your system does not have all interfaces and interface 0 does not work, try the other ones. The QBOX was introduced for the Haswell EP microarchitecture; for older Uncore-aware architectures the QBOX and the SBOX are the same.
|
|
|
#### Counters
|
|
|
The QPI hardware performance counters are exposed to the operating system through PCI interfaces. There are two of those interfaces for the QPI. The actual number of QBOX counters depends on the CPU core count of one socket. If your system does not have all interfaces and interface 0 does not work, try the other ones. The QBOX was introduced for the Haswell EP microarchitecture; for older uncore-aware architectures the QBOX and the SBOX are the same.
|
|
|
##### Counters
|
|
|
<TABLE>
|
|
|
<TR>
|
|
|
<TH>Counter name</TH>
|
... | ... | @@ -370,12 +377,12 @@ The QPI hardware performance counters are exposed to the operating system throug |
|
|
</TR>
|
|
|
</TABLE>
|
|
|
|
|
|
### QPI interface general-purpose counters
|
|
|
The Intel® Haswell EP/EN/EX microarchitecture provides measurements of the QPI Link layer (QPI) in the Uncore. The description from Intel®:<BR>
|
|
|
<I>The Intel® QPI Link Layer is responsible for packetizing requests from the caching agent on the way out to the system interface. As such, it shares responsibility with the CBo(s) as the Intel® QPI caching agent(s). It is responsible for converting CBo requests to Intel® QPI messages (i.e. snoop generation and data response messages from the snoop response) as well as converting/forwarding ring messages to Intel® QPI packets and vice versa. On Ivy Bridge, Intel® QPI is split into two separate layers. The Intel® QPI LL (link layer) is responsible for generating, transmitting, and receiving packets with the Intel® QPI link.
|
|
|
#### QPI interface general-purpose counters
|
|
|
The Intel® Haswell EP/EN/EX microarchitecture provides measurements of the QPI Link layer (QPI) in the uncore. The description from Intel®:<BR>
|
|
|
<I>The Intel® QPI Link Layer is responsible for packetizing requests from the caching agent on the way out to the system interface. As such, it shares responsibility with the CBo(s) as the Intel® QPI caching agent(s). It is responsible for converting CBo requests to Intel® QPI messages (i.e. snoop generation and data response messages from the snoop response) as well as converting/forwarding ring messages to Intel® QPI packets and vice versa. On Intel® Xeon processor E5 v3 family, Intel® QPI is split into two separate layers. The Intel® QPI LL (link layer) is responsible for generating, transmitting, and receiving packets with the Intel® QPI link.
|
|
|
</I><BR>
|
|
|
The QPI hardware performance counters are exposed to the operating system through PCI interfaces. There are two of those interfaces for the QPI. The actual number of QBOX counters depends on the CPU core count of one socket. If your system does not have all interfaces and interface 0 does not work, try the other ones. The QBOX was introduced for the Haswell EP microarchitecture; for older Uncore-aware architectures the QBOX and the SBOX are the same.
|
|
|
#### Counters
|
|
|
The QPI hardware performance counters are exposed to the operating system through PCI interfaces. There are two of those interfaces for the QPI. The actual number of QBOX counters depends on the CPU core count of one socket. If your system does not have all interfaces and interface 0 does not work, try the other ones. The QBOX was introduced for the Haswell EP microarchitecture; for older uncore-aware architectures the QBOX and the SBOX are the same.
|
|
|
##### Counters
|
|
|
<TABLE>
|
|
|
<TR>
|
|
|
<TH>Counter name</TH>
|
... | ... | @@ -399,7 +406,7 @@ The QPI hardware performance counters are exposed to the operating system throug |
|
|
</TR>
|
|
|
</TABLE>
|
|
|
|
|
|
#### Available Options
|
|
|
##### Available Options
|
|
|
<TABLE>
|
|
|
<TR>
|
|
|
<TH>Option</TH>
|
... | ... | @@ -423,58 +430,59 @@ The QPI hardware performance counters are exposed to the operating system throug |
|
|
<TD>match0</TD>
|
|
|
<TD>32 bit hex address</TD>
|
|
|
<TD>Input value masked with 0x8003FFF8 and written to bits 0-31 in the PCI_UNC_V3_QPI_PMON_RX_MATCH_0 register of PCI device</TD>
|
|
|
<TD>This option matches the receive side. Check <A HREF="http://www.Intel.de/content/www/de/de/processors/xeon/xeon-e5-2600-v2-uncore-manual.html">Intel® Xeon E5-2600 v3 Uncore Manual</A> for bit fields.</TD>
|
|
|
<TD>This option matches the receive side. Check <A HREF="http://www.Intel.de/content/www/de/de/processors/xeon/xeon-e5-2600-v2-uncore-manual.html">Intel® Xeon E5-2600 v3 uncore Manual</A> for bit fields.</TD>
|
|
|
</TR>
|
|
|
<TR>
|
|
|
<TD>match1</TD>
|
|
|
<TD>20 bit hex address</TD>
|
|
|
<TD>Input value masked with 0x000F000F and written to bits 0-19 in the PCI_UNC_V3_QPI_PMON_RX_MATCH_1 register of PCI device</TD>
|
|
|
<TD>This option matches the receive side. Check <A HREF="http://www.Intel.de/content/www/de/de/processors/xeon/xeon-e5-2600-v2-uncore-manual.html">Intel® Xeon E5-2600 v3 Uncore Manual</A> for bit fields.</TD>
|
|
|
<TD>This option matches the receive side. Check <A HREF="http://www.Intel.de/content/www/de/de/processors/xeon/xeon-e5-2600-v2-uncore-manual.html">Intel® Xeon E5-2600 v3 uncore Manual</A> for bit fields.</TD>
|
|
|
</TR>
|
|
|
<TR>
|
|
|
<TD>match2</TD>
|
|
|
<TD>32 bit hex address</TD>
|
|
|
<TD>Input value masked with 0x8003FFF8 and written to bits 0-31 in the PCI_UNC_V3_QPI_PMON_TX_MATCH_0 register of PCI device</TD>
|
|
|
<TD>This option matches the transmit side. Check <A HREF="http://www.Intel.de/content/www/de/de/processors/xeon/xeon-e5-2600-v2-uncore-manual.html">Intel® Xeon E5-2600 v3 Uncore Manual</A> for bit fields.</TD>
|
|
|
<TD>This option matches the transmit side. Check <A HREF="http://www.Intel.de/content/www/de/de/processors/xeon/xeon-e5-2600-v2-uncore-manual.html">Intel® Xeon E5-2600 v3 uncore Manual</A> for bit fields.</TD>
|
|
|
</TR>
|
|
|
<TR>
|
|
|
<TD>match3</TD>
|
|
|
<TD>20 bit hex address</TD>
|
|
|
<TD>Input value masked with 0x000F000F and written to bits 0-19 in the PCI_UNC_V3_QPI_PMON_TX_MATCH_1 register of PCI device</TD>
|
|
|
<TD>This option matches the transmit side. Check <A HREF="http://www.Intel.de/content/www/de/de/processors/xeon/xeon-e5-2600-v2-uncore-manual.html">Intel® Xeon E5-2600 v3 Uncore Manual</A> for bit fields.</TD>
|
|
|
<TD>This option matches the transmit side. Check <A HREF="http://www.Intel.de/content/www/de/de/processors/xeon/xeon-e5-2600-v2-uncore-manual.html">Intel® Xeon E5-2600 v3 uncore Manual</A> for bit fields.</TD>
|
|
|
</TR>
|
|
|
<TR>
|
|
|
<TD>mask0</TD>
|
|
|
<TD>32 bit hex address</TD>
|
|
|
<TD>Input value masked with 0x8003FFF8 and written to bits 0-31 in the PCI_UNC_V3_QPI_PMON_RX_MASK_0 register of PCI device</TD>
|
|
|
<TD>This option masks the receive side. Check <A HREF="http://www.Intel.de/content/www/de/de/processors/xeon/xeon-e5-2600-v2-uncore-manual.html">Intel® Xeon E5-2600 v3 Uncore Manual</A> for bit fields.</TD>
|
|
|
<TD>This option masks the receive side. Check <A HREF="http://www.Intel.de/content/www/de/de/processors/xeon/xeon-e5-2600-v2-uncore-manual.html">Intel® Xeon E5-2600 v3 uncore Manual</A> for bit fields.</TD>
|
|
|
</TR>
|
|
|
<TR>
|
|
|
<TD>mask1</TD>
|
|
|
<TD>20 bit hex address</TD>
|
|
|
<TD>Input value masked with 0x000F000F and written to bits 0-19 in the PCI_UNC_V3_QPI_PMON_RX_MASK_1 register of PCI device</TD>
|
|
|
<TD>This option masks the receive side. Check <A HREF="http://www.Intel.de/content/www/de/de/processors/xeon/xeon-e5-2600-v2-uncore-manual.html">Intel® Xeon E5-2600 v3 Uncore Manual</A> for bit fields.</TD>
|
|
|
<TD>This option masks the receive side. Check <A HREF="http://www.Intel.de/content/www/de/de/processors/xeon/xeon-e5-2600-v2-uncore-manual.html">Intel® Xeon E5-2600 v3 uncore Manual</A> for bit fields.</TD>
|
|
|
</TR>
|
|
|
<TR>
|
|
|
<TD>mask2</TD>
|
|
|
<TD>32 bit hex address</TD>
|
|
|
<TD>Input value masked with 0x8003FFF8 and written to bits 0-31 in the PCI_UNC_V3_QPI_PMON_TX_MASK_0 register of PCI device</TD>
|
|
|
<TD>This option masks the transmit side. Check <A HREF="http://www.Intel.de/content/www/de/de/processors/xeon/xeon-e5-2600-v2-uncore-manual.html">Intel® Xeon E5-2600 v3 Uncore Manual</A> for bit fields.</TD>
|
|
|
<TD>This option masks the transmit side. Check <A HREF="http://www.Intel.de/content/www/de/de/processors/xeon/xeon-e5-2600-v2-uncore-manual.html">Intel® Xeon E5-2600 v3 uncore Manual</A> for bit fields.</TD>
|
|
|
</TR>
|
|
|
<TR>
|
|
|
<TD>mask3</TD>
|
|
|
<TD>20 bit hex address</TD>
|
|
|
<TD>Input value masked with 0x000F000F and written to bits 0-19 in the PCI_UNC_V3_QPI_PMON_TX_MASK_1 register of PCI device</TD>
|
|
|
<TD>This option masks the transmit side. Check <A HREF="http://www.Intel.de/content/www/de/de/processors/xeon/xeon-e5-2600-v2-uncore-manual.html">Intel® Xeon E5-2600 v3 Uncore Manual</A> for bit fields.</TD>
|
|
|
<TD>This option masks the transmit side. Check <A HREF="http://www.Intel.de/content/www/de/de/processors/xeon/xeon-e5-2600-v2-uncore-manual.html">Intel® Xeon E5-2600 v3 uncore Manual</A> for bit fields.</TD>
|
|
|
</TR>
|
|
|
</TABLE>
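
The eight match and mask options listed above all follow the same pattern: the input is masked and then written to a specific register. The sketch below restates that mapping as a lookup table. The masks and register names are taken from the table; the helper `qpi_filter_write` and the dictionary name are hypothetical and do not reflect LIKWID's actual code, and no PCI access is performed.

```python
# option -> (mask applied to the input, target register), as listed in the table
QPI_FILTER_OPTIONS = {
    "match0": (0x8003FFF8, "PCI_UNC_V3_QPI_PMON_RX_MATCH_0"),
    "match1": (0x000F000F, "PCI_UNC_V3_QPI_PMON_RX_MATCH_1"),
    "match2": (0x8003FFF8, "PCI_UNC_V3_QPI_PMON_TX_MATCH_0"),
    "match3": (0x000F000F, "PCI_UNC_V3_QPI_PMON_TX_MATCH_1"),
    "mask0": (0x8003FFF8, "PCI_UNC_V3_QPI_PMON_RX_MASK_0"),
    "mask1": (0x000F000F, "PCI_UNC_V3_QPI_PMON_RX_MASK_1"),
    "mask2": (0x8003FFF8, "PCI_UNC_V3_QPI_PMON_TX_MASK_0"),
    "mask3": (0x000F000F, "PCI_UNC_V3_QPI_PMON_TX_MASK_1"),
}


def qpi_filter_write(option: str, value: int) -> tuple[str, int]:
    """Return (register name, value that would be written) for one option.

    Hypothetical helper for illustration only.
    """
    mask, register = QPI_FILTER_OPTIONS[option]
    return register, value & mask


if __name__ == "__main__":
    register, written = qpi_filter_write("match1", 0xFFFFFFFF)
    print(register, hex(written))
```

Options 0 and 1 address the receive side and options 2 and 3 the transmit side, so a receive-side match is usually paired with the receive-side mask of the same index.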
|
|
|
|
|
|
### Last Level cache counters
|
|
|
The Intel® Haswell EP/EN/EX microarchitecture provides measurements of the LLC coherency engine in the Uncore. The description from Intel®:<BR>
|
|
|
<I>The LLC coherence engine (CBo) manages the interface between the core and the last level cache (LLC). All core transactions that access the LLC are directed from the core to a CBo via the ring interconnect. The CBo is responsible for managing data delivery from the LLC to the requesting core. It is also responsible for maintaining coherence between the cores within the socket that share the LLC; generating snoops and collecting snoop responses from the local cores when the MESIF protocol requires it.
|
|
|
#### Last Level cache counters
|
|
|
The Intel® Haswell EP/EN/EX microarchitecture provides measurements of the LLC coherency engine in the uncore. The description from Intel®:<BR>
|
|
|
<I>The LLC coherence engine (CBo) manages the interface between the core and the last level cache (LLC). All core transactions that access the LLC are directed from the core to a CBo via the ring interconnect. The CBo is responsible for managing data delivery
|
|
|
from the LLC to the requesting core. It is also responsible for maintaining coherence between the cores within the socket that share the LLC; generating snoops and collecting snoop responses from the local cores when the MESIF protocol requires it.
|
|
|
</I><BR>
|
|
|
The LLC hardware performance counters are exposed to the operating system through the MSR interface. The maximum number of supported coherency engines for the Intel® Haswell EP/EN/EX microarchitecture is 17. E7-8800 v2 systems have all 17 engines, the E5-2600 v2 only 10 of them and the E5-1600 v2 only 6. Your system may not have all CBOXes; LIKWID skips the unavailable ones in the setup phase. The name CBOX originates from the Nehalem EX Uncore monitoring where those functional units are called CBOX.
|
|
|
#### Counters
|
|
|
The LLC hardware performance counters are exposed to the operating system through the MSR interface. The maximum number of supported coherency engines for the Intel® Haswell EP/EN/EX microarchitecture is 17. E7-8800 v2 systems have all 17 engines, the E5-2600 v2 only 10 of them and the E5-1600 v2 only 6. Your system may not have all CBOXes; LIKWID skips the unavailable ones in the setup phase. The name CBOX originates from the Nehalem EX uncore monitoring.
|
|
|
##### Counters
|
|
|
<TABLE>
|
|
|
<TR>
|
|
|
<TH>Counter name</TH>
|
... | ... | @@ -498,7 +506,7 @@ The LLC hardware performance counters are exposed to the operating system throug |
|
|
</TR>
|
|
|
</TABLE>
|
|
|
|
|
|
#### Available Options
|
|
|
##### Available Options
|
|
|
<TABLE>
|
|
|
<TR>
|
|
|
<TH>Option</TH>
|
... | ... | @@ -540,26 +548,32 @@ The LLC hardware performance counters are exposed to the operating system throug |
|
|
<TD>opcode</TD>
|
|
|
<TD>9 bit hex value</TD>
|
|
|
<TD>Set bits 20-28 in MSR_UNC_C<0-17>_PMON_BOX_FILTER1 register</TD>
|
|
|
<TD>A list of valid opcodes can be found in the <A HREF="http://www.Intel.de/content/www/de/de/processors/xeon/xeon-e5-2600-v2-uncore-manual.html">Intel® Xeon E5-2600 v3 Uncore Manual</A>.</TD>
|
|
|
<TD>A list of valid opcodes can be found in the <A HREF="http://www.Intel.de/content/www/de/de/processors/xeon/xeon-e5-2600-v2-uncore-manual.html">Intel® Xeon E5-2600 v3 uncore Manual</A>.</TD>
|
|
|
</TR>
|
|
|
<TR>
|
|
|
<TD>match0</TD>
|
|
|
<TD>2 bit hex address</TD>
|
|
|
<TD>Set bits 30-31 in MSR_UNC_C<0-17>_PMON_BOX_FILTER1 register</TD>
|
|
|
<TD>See the <A HREF="http://www.Intel.de/content/www/de/de/processors/xeon/xeon-e5-2600-v2-uncore-manual.html">Intel® Xeon E5-2600 v3 Uncore Manual</A> for more information.</TD>
|
|
|
<TD>See the <A HREF="http://www.Intel.de/content/www/de/de/processors/xeon/xeon-e5-2600-v2-uncore-manual.html">Intel® Xeon E5-2600 v3 uncore Manual</A> for more information.</TD>
|
|
|
</TR>
|
|
|
</TABLE>
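
The bit positions given for `opcode` and `match0` can be sketched as follows. This covers only those two fields of the FILTER1 register; the function name `cbox_filter1_value` is hypothetical, and truncating over-wide inputs to the documented field width is an assumption.

```python
OPCODE_SHIFT = 20   # opcode occupies bits 20-28 (9 bits)
OPCODE_MASK = 0x1FF
MATCH0_SHIFT = 30   # match0 occupies bits 30-31 (2 bits)
MATCH0_MASK = 0x3


def cbox_filter1_value(opcode: int = 0, match0: int = 0) -> int:
    """Place opcode and match0 into their FILTER1 bit fields.

    Hypothetical helper for illustration only; other fields of the
    register are left at zero.
    """
    value = (opcode & OPCODE_MASK) << OPCODE_SHIFT
    value |= (match0 & MATCH0_MASK) << MATCH0_SHIFT
    return value


if __name__ == "__main__":
    print(hex(cbox_filter1_value(opcode=0x1FF, match0=0x3)))
```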
|
|
|
|
|
|
#### Special handling for events
|
|
|
##### Special handling for events
|
|
|
The Intel® Haswell EP/EN/EX microarchitecture provides an event LLC_LOOKUP which can be filtered with the 'state' option. If no 'state' is set, LIKWID sets the state to 0x1F, the default value to measure all lookups.
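
The default described above amounts to a simple fallback. The sketch below assumes that options arrive as a dictionary and that affected event names start with `LLC_LOOKUP`; both are illustrative assumptions, and `apply_default_state` is a hypothetical name rather than a LIKWID function.

```python
DEFAULT_LLC_LOOKUP_STATE = 0x1F  # default stated in the text: measure all lookups


def apply_default_state(event: str, options: dict) -> dict:
    """Return a copy of options with 'state' filled in for LLC_LOOKUP events.

    Hypothetical helper for illustration only.
    """
    result = dict(options)
    if event.startswith("LLC_LOOKUP") and "state" not in result:
        result["state"] = DEFAULT_LLC_LOOKUP_STATE
    return result


if __name__ == "__main__":
    print(apply_default_state("LLC_LOOKUP", {}))
```

An explicitly given `state` is kept unchanged; only a missing one is replaced by 0x1F.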
|
|
|
|
|
|
### Uncore management fixed-purpose counter
|
|
|
The Intel® Haswell EP/EN/EX microarchitecture provides measurements of the management box in the Uncore. The description from Intel®:<BR>
|
|
|
<I>The UBox serves as the system configuration controller within the physical processor.
|
|
|
</I><BR>
|
|
|
The single fixed-purpose counter counts the clock cycles of the Uncore clock source.
|
|
|
The Uncore management performance counters are exposed to the operating system through the MSR interface. The name UBOX originates from the Nehalem EX Uncore monitoring where those functional units are called UBOX.
|
|
|
#### Counter
|
|
|
#### Uncore management fixed-purpose counter
|
|
|
The Intel® Haswell EP/EN/EX microarchitecture provides measurements of the management box in the uncore. The description from Intel®:<BR>
|
|
|
<I>The UBox serves as the system configuration controller for the Intel Xeon processor E5
|
|
|
v3 family.
|
|
|
In this capacity, the UBox acts as the central unit for a variety of functions:</I>
|
|
|
- <I>The master for reading and writing physically distributed registers across Intel® Xeon processor E5 v3 family using the Message Channel.</I>
|
|
|
- <I>The UBox is the intermediary for interrupt traffic, receiving interrupts from the system and dispatching interrupts to the appropriate core.</I>
|
|
|
- <I>The UBox serves as the system lock master used when quiescing the platform (e.g., Intel® QPI bus lock).</I>
|
|
|
|
|
|
|
|
|
The single fixed-purpose counter counts the clock cycles of the uncore clock source.
|
|
|
The uncore management performance counters are exposed to the operating system through the MSR interface. The name UBOX originates from the Nehalem EX uncore monitoring.
|
|
|
##### Counter
|
|
|
<TABLE>
|
|
|
<TR>
|
|
|
<TH>Counter name</TH>
|
... | ... | @@ -572,11 +586,16 @@ The Uncore management performance counters are exposed to the operating system t |
|
|
</TABLE>
|
|
|
|
|
|
### Uncore management general-purpose counters
|
|
|
The Intel® Haswell EP/EN/EX microarchitecture provides measurements of the management box in the Uncore. The description from Intel®:<BR>
|
|
|
<I>The UBox serves as the system configuration controller within the physical processor.
|
|
|
</I><BR>
|
|
|
The Uncore management performance counters are exposed to the operating system through the MSR interface. The name UBOX originates from the Nehalem EX Uncore monitoring where those functional units are called UBOX.
|
|
|
#### Counter
|
|
|
The Intel® Haswell EP/EN/EX microarchitecture provides measurements of the management box in the uncore. The description from Intel®:<BR>
|
|
|
<I>The UBox serves as the system configuration controller for the Intel Xeon processor E5
|
|
|
v3 family.
|
|
|
In this capacity, the UBox acts as the central unit for a variety of functions:</I>
|
|
|
- <I>The master for reading and writing physically distributed registers across Intel® Xeon processor E5 v3 family using the Message Channel.</I>
|
|
|
- <I>The UBox is the intermediary for interrupt traffic, receiving interrupts from the system and dispatching interrupts to the appropriate core.</I>
|
|
|
- <I>The UBox serves as the system lock master used when quiescing the platform (e.g., Intel® QPI bus lock).</I>
|
|
|
|
|
|
The uncore management performance counters are exposed to the operating system through the MSR interface. The name UBOX originates from the Nehalem EX uncore monitoring.
|
|
|
##### Counter
|
|
|
<TABLE>
|
|
|
<TR>
|
|
|
<TH>Counter name</TH>
|
... | ... | @@ -591,7 +610,7 @@ The Uncore management performance counters are exposed to the operating system t |
|
|
<TD>*</TD>
|
|
|
</TR>
|
|
|
</TABLE>
|
|
|
#### Available Options
|
|
|
##### Available Options
|
|
|
<TABLE>
|
|
|
<TR>
|
|
|
<TH>Option</TH>
|
... | ... | @@ -613,14 +632,14 @@ The Uncore management performance counters are exposed to the operating system t |
|
|
</TR>
|
|
|
</TABLE>
|
|
|
|
|
|
### Power control unit fixed-purpose counters
|
|
|
The Intel® Haswell EP/EN/EX microarchitecture provides measurements of the power control unit (PCU) in the Uncore. The description from Intel®:<BR>
|
|
|
<I>The PCU is the primary Power Controller for the physical processor package.
|
|
|
The uncore implements a power control unit acting as a core/uncore power and thermal manager. It runs its firmware on an internal micro-controller and coordinates the socket’s power states.
|
|
|
</I><BR>
|
|
|
The Power control unit offers two fixed-purpose counters that count the cycles the CPU cores spend in the C6 and C3 states.
|
|
|
The Uncore management performance counters are exposed to the operating system through the MSR interface. The name WBOX originates from the Nehalem EX Uncore monitoring where those functional units are called WBOX.
|
|
|
#### Counters
|
|
|
#### Power control unit fixed-purpose counters
|
|
|
The Intel® Haswell EP/EN/EX microarchitecture provides measurements of the power control unit (PCU) in the uncore. The description from Intel®:<BR>
|
|
|
<I>The PCU is the primary Power Controller for the Intel® Xeon processor E5 v3 family. Intel® Xeon processor E5 v3 family uncore implements a power control unit acting as a core/uncore power and thermal manager. It runs its firmware on an internal micro-controller and coordinates the socket’s power states.
|
|
|
</I>
|
|
|
|
|
|
The PCU offers two fixed-purpose counters to retrieve the number of cycles the CPU cores spend in the C6 and C3 states.
|
|
|
The uncore management performance counters are exposed to the operating system through the MSR interface. The name WBOX originates from the Nehalem EX uncore monitoring.
|
|
|
##### Counters
|
|
|
<TABLE>
|
|
|
<TR>
|
|
|
<TH>Counter name</TH>
|
... | ... | @@ -636,13 +655,13 @@ The Uncore management performance counters are exposed to the operating system t |
|
|
</TR>
|
|
|
</TABLE>
|
|
|
|
|
|
### Power control unit general-purpose counters
|
|
|
The Intel® Haswell EP/EN/EX microarchitecture provides measurements of the power control unit (PCU) in the Uncore. The description from Intel®:<BR>
|
|
|
<I>The PCU is the primary Power Controller for the physical processor package.
|
|
|
The uncore implements a power control unit acting as a core/uncore power and thermal manager. It runs its firmware on an internal micro-controller and coordinates the socket’s power states.
|
|
|
</I><BR>
|
|
|
The Uncore management performance counters are exposed to the operating system through the MSR interface. The name WBOX originates from the Nehalem EX Uncore monitoring where those functional units are called WBOX.
|
|
|
#### Counters
|
|
|
#### Power control unit general-purpose counters
|
|
|
The Intel® Haswell EP/EN/EX microarchitecture provides measurements of the power control unit (PCU) in the uncore. The description from Intel®:<BR>
|
|
|
<I>The PCU is the primary Power Controller for the Intel® Xeon processor E5 v3 family. Intel® Xeon processor E5 v3 family uncore implements a power control unit acting as a core/uncore power and thermal manager. It runs its firmware on an internal micro-controller and coordinates the socket’s power states.
|
|
|
</I>
|
|
|
|
|
|
The PCU performance counters are exposed to the operating system through the MSR interface. The name WBOX originates from the Nehalem EX uncore monitoring.
|
|
|
##### Counters
|
|
|
<TABLE>
|
|
|
<TR>
|
|
|
<TH>Counter name</TH>
|
... | ... | @@ -666,7 +685,7 @@ The Uncore management performance counters are exposed to the operating system t |
|
|
</TR>
|
|
|
</TABLE>
|
|
|
|
|
|
#### Available Options
|
|
|
##### Available Options
|
|
|
<TABLE>
|
|
|
<TR>
|
|
|
<TH>Option</TH>
|
... | ... | @@ -712,13 +731,14 @@ The Uncore management performance counters are exposed to the operating system t |
|
|
</TR>
|
|
|
</TABLE>
|
|
|
|
|
|
### Memory controller fixed-purpose counter
|
|
|
The Intel® Haswell EP/EN/EX microarchitecture provides measurements of the integrated Memory Controllers (iMC) in the Uncore. The description from Intel®:<BR>
|
|
|
<I>The integrated Memory Controller provides the interface to DRAM and communicates to the rest of the uncore through the Home Agent (i.e. the iMC does not connect to the Ring).<BR>
|
|
|
#### Memory controller fixed-purpose counter
|
|
|
The Intel® Haswell EP/EN/EX microarchitecture provides measurements of the integrated Memory Controllers (iMC) in the uncore. The description from Intel®:<BR>
|
|
|
<I>The Intel® Xeon processor E5 v3 family integrated Memory Controller provides the interface to DRAM and communicates to the rest of the uncore through the Home Agent (i.e. the IMC does not connect to the Ring).<BR>
|
|
|
In conjunction with the HA, the memory controller also provides a variety of RAS features, such as ECC, lockstep, memory access retry, memory scrubbing, thermal throttling, mirroring, and rank sparing.
|
|
|
</I><BR>
|
|
|
The performance counters of the integrated Memory Controllers are exposed to the operating system through PCI interfaces. There may be two memory controllers in the system (E7-8800 v2). There are four different PCI devices per memory controller, each covering one memory channel. Each channel has one fixed counter for the DRAM clock. The four channels of the first memory controller are named MBOX0-3; the four channels of the second memory controller (if available) are named MBOX4-7. The name MBOX originates from the Nehalem EX Uncore monitoring where those functional units are called MBOX.
|
|
|
#### Counters
|
|
|
</I>
|
|
|
|
|
|
The performance counters of the integrated Memory Controllers are exposed to the operating system through PCI interfaces. There may be two memory controllers in the system (E7-8800 v2). There are four different PCI devices per memory controller, each covering one memory channel. Each channel has one fixed counter for the DRAM clock. The four channels of the first memory controller are named MBOX0-3; the four channels of the second memory controller (if available) are named MBOX4-7. The name MBOX originates from the Nehalem EX uncore monitoring.
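
The channel naming described above can be illustrated with a minimal Python sketch. The helper `mbox_name` and its parameters are hypothetical and not part of LIKWID; it only assumes what the text states, namely four channels per memory controller and at most two controllers, numbered consecutively.

```python
def mbox_name(controller: int, channel: int, channels_per_controller: int = 4) -> str:
    """Return the MBOX name for a memory channel.

    Hypothetical helper for illustration only (not a LIKWID API).
    Assumes the layout described in the text: controller 0 owns
    MBOX0-3 and controller 1 (if present) owns MBOX4-7.
    """
    if controller not in (0, 1):
        raise ValueError("at most two memory controllers are assumed")
    if not 0 <= channel < channels_per_controller:
        raise ValueError("channel index out of range")
    # Channels are numbered consecutively across both controllers.
    return f"MBOX{controller * channels_per_controller + channel}"


if __name__ == "__main__":
    # List all names for a system with two memory controllers.
    for ctrl in (0, 1):
        print(ctrl, [mbox_name(ctrl, ch) for ch in range(4)])
```

Under these assumptions, channel 2 of the second memory controller maps to `MBOX6`.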
|
|
|
##### Counters
|
|
|
<TABLE>
|
|
|
<TR>
|
|
|
<TH>Counter name</TH>
|
... | ... | @@ -731,13 +751,14 @@ The integrated Memory Controllers performance counters are exposed to the operat |
|
|
</TABLE>
|
|
|
|
|
|
|
|
|
### Memory controller general-purpose counter
|
|
|
The Intel® Haswell EP/EN/EX microarchitecture provides measurements of the integrated Memory Controllers (iMC) in the Uncore. The description from Intel®:<BR>
|
|
|
<I>The integrated Memory Controller provides the interface to DRAM and communicates to the rest of the uncore through the Home Agent (i.e. the iMC does not connect to the Ring).<BR>
|
|
|
#### Memory controller general-purpose counter
|
|
|
The Intel® Haswell EP/EN/EX microarchitecture provides measurements of the integrated Memory Controllers (iMC) in the uncore. The description from Intel®:<BR>
|
|
|
<I>The Intel® Xeon processor E5 v3 family integrated Memory Controller provides the interface to DRAM and communicates to the rest of the uncore through the Home Agent (i.e. the IMC does not connect to the Ring).<BR>
|
|
|
In conjunction with the HA, the memory controller also provides a variety of RAS features, such as ECC, lockstep, memory access retry, memory scrubbing, thermal throttling, mirroring, and rank sparing.
|
|
|
</I><BR>
|
|
|
The performance counters of the integrated Memory Controllers are exposed to the operating system through PCI interfaces. There may be two memory controllers in the system (E7-8800 v2). There are four different PCI devices per memory controller, each covering one memory channel. Each channel has four different general-purpose counters. The four channels of the first memory controller are named MBOX0-3; the four channels of the second memory controller (if available) are named MBOX4-7. The name MBOX originates from the Nehalem EX Uncore monitoring where those functional units are called MBOX.
|
|
|
#### Counters
|
|
|
</I>
|
|
|
|
|
|
The performance counters of the integrated Memory Controllers are exposed to the operating system through PCI interfaces. There may be two memory controllers in the system (E7-8800 v2). There are four different PCI devices per memory controller, each covering one memory channel. Each channel has four different general-purpose counters. The four channels of the first memory controller are named MBOX0-3; the four channels of the second memory controller (if available) are named MBOX4-7. The name MBOX originates from the Nehalem EX uncore monitoring.
|
|
|
##### Counters
|
|
|
<TABLE>
|
|
|
<TR>
|
|
|
<TH>Counter name</TH>
|
... | ... | @@ -761,7 +782,7 @@ The integrated Memory Controllers performance counters are exposed to the operat |
|
|
</TR>
|
|
|
</TABLE>
|
|
|
|
|
|
#### Available Options
|
|
|
##### Available Options
|
|
|
<TABLE>
|
|
|
<TR>
|
|
|
<TH>Option</TH>
|
... | ... | @@ -783,13 +804,14 @@ The integrated Memory Controllers performance counters are exposed to the operat |
|
|
</TR>
|
|
|
</TABLE>
|
|
|
|
|
|
### Ring-to-QPI counters
|
|
|
The Intel® Haswell EP/EN/EX microarchitecture provides measurements of the Ring-to-QPI (R3QPI) interface in the Uncore. The description from Intel®:<BR>
|
|
|
#### Ring-to-QPI counters
|
|
|
The Intel® Haswell EP/EN/EX microarchitecture provides measurements of the Ring-to-QPI (R3QPI) interface in the uncore. The description from Intel®:<BR>
|
|
|
<I>R3QPI is the interface between the Intel® QPI Link Layer, which packetizes requests, and the Ring.<BR>
|
|
|
R3QPI is the interface between the ring and the Intel® QPI Link Layer. It is responsible for translating between ring protocol packets and flits that are used for transmitting data across the Intel® QPI interface. It performs credit checking between the local Intel® QPI LL, the remote Intel® QPI LL and other agents on the local ring.
|
|
|
</I><BR>
|
|
|
The Ring-to-QPI performance counters are exposed to the operating system through PCI interfaces. Since the RBOXes manage the traffic between the LLC-connecting ring interface on the socket and the QPI interfaces (SBOXes), their number is similar to the number of SBOXes. See the SBOX section for how many are available in which system configuration. The name RBOX originates from the Nehalem EX Uncore monitoring where those functional units are called RBOX.
|
|
|
#### Counters
|
|
|
</I>
|
|
|
|
|
|
The Ring-to-QPI performance counters are exposed to the operating system through PCI interfaces. Since the RBOXes manage the traffic between the LLC-connecting ring interface on the socket and the QPI interfaces (SBOXes), their number is similar to the number of SBOXes. See the SBOX section for how many are available in which system configuration. The name RBOX originates from the Nehalem EX uncore monitoring.
|
|
|
##### Counters
|
|
|
<TABLE>
|
|
|
<TR>
|
|
|
<TH>Counter name</TH>
|
... | ... | @@ -809,7 +831,7 @@ The Ring-to-QPI performance counters are exposed to the operating system through |
|
|
</TR>
|
|
|
</TABLE>
|
|
|
|
|
|
#### Available Options
|
|
|
##### Available Options
|
|
|
<TABLE>
|
|
|
<TR>
|
|
|
<TH>Option</TH>
|
... | ... | @@ -831,12 +853,13 @@ The Ring-to-QPI performance counters are exposed to the operating system through |
|
|
</TR>
|
|
|
</TABLE>
|
|
|
|
|
|
### Ring-to-PCIe counters
|
|
|
The Intel® Haswell EP/EN/EX microarchitecture provides measurements of the Ring-to-PCIe (R2PCIe) interface in the Uncore. The description from Intel®:<BR>
|
|
|
#### Ring-to-PCIe counters
|
|
|
The Intel® Haswell EP/EN/EX microarchitecture provides measurements of the Ring-to-PCIe (R2PCIe) interface in the uncore. The description from Intel®:<BR>
|
|
|
<I>R2PCIe represents the interface between the Ring and IIO traffic to/from PCIe.
|
|
|
</I><BR>
|
|
|
</I>
|
|
|
|
|
|
The Ring-to-PCIe performance counters are exposed to the operating system through a PCI interface. Independent of the system's configuration, there is only one Ring-to-PCIe interface per CPU socket.
|
|
|
#### Counters
|
|
|
##### Counters
|
|
|
<TABLE>
|
|
|
<TR>
|
|
|
<TH>Counter name</TH>
|
... | ... | @@ -860,7 +883,7 @@ The Ring-to-PCIe performance counters are exposed to the operating system throug |
|
|
</TR>
|
|
|
</TABLE>
|
|
|
|
|
|
#### Available Options
|
|
|
##### Available Options
|
|
|
<TABLE>
|
|
|
<TR>
|
|
|
<TH>Option</TH>
|
... | ... | @@ -883,11 +906,12 @@ The Ring-to-PCIe performance counters are exposed to the operating system throug |
|
|
</TABLE>
|
|
|
|
|
|
#### IRP box counters
|
|
|
The Intel® Haswell EP/EN/EX microarchitecture provides measurements of the IRP box in the Uncore. The description from Intel®:<BR>
|
|
|
The Intel® Haswell EP/EN/EX microarchitecture provides measurements of the IRP box in the uncore. The description from Intel®:<BR>
|
|
|
<I>IRP is responsible for maintaining coherency for IIO traffic that needs to be coherent (e.g. cross-socket P2P).
|
|
|
</I><BR>
|
|
|
The uncore management performance counters are exposed to the operating system through the PCI interface. The IBOX was introduced with the Intel® IvyBridge EP/EN/EX microarchitecture.
|
|
|
#### Counters
|
|
|
</I>
|
|
|
|
|
|
The IRP box counters are exposed to the operating system through the PCI interface. The IBOX was introduced with the Intel® IvyBridge EP/EN/EX microarchitecture.
|
|
|
##### Counters
|
|
|
<TABLE>
|
|
|
<TR>
|
|
|
<TH>Counter name</TH>
|
... | ... | @@ -903,7 +927,7 @@ The uncore management performance counters are exposed to the operating system t |
|
|
</TR>
|
|
|
</TABLE>
|
|
|
|
|
|
#### Available Options
|
|
|
##### Available Options
|
|
|
<TABLE>
|
|
|
<TR>
|
|
|
<TH>Option</TH>
|
... | ... | |