[BUG] hwloc causing Segmentation fault
Created by: DavVad
Describe the bug
I am trying to use LIKWID 5.2.0 on our HPC cluster. The likwid-perfctr
command fails with a segmentation fault at ./src/pci_hwloc.c:81.
We get the following output:
# ./likwid-perfctr -g BRANCH sleep 1
--------------------------------------------------------------------------------
CPU name: Intel(R) Xeon(R) CPU E5-2697 v3 @ 2.60GHz
CPU type: Intel Xeon Haswell EN/EP/EX processor
CPU clock: 2.60 GHz
Segmentation fault
The same run with the -V 3 option: https://pastebin.com/snEPXWpv
Here is the gdb output:
Program received signal SIGSEGV, Segmentation fault.
0x00007ffff5fe2fda in hwloc_pci_init (testDevice=12080, socket_bus=0x7ffff70ad240 <socket_bus>, nrSockets=0x7ffff70ad224 <nr_sockets>) at ./src/pci_hwloc.c:81
81 while (walk->type != HWLOC_OBJ_SOCKET) walk = walk->parent;
The full gdb output, including the result of the info stack command, is here: https://pastebin.com/nwHT06XY
The cause seems to be a NULL pointer at this point: p walk in gdb shows $1 = (likwid_hwloc_obj_t) 0x0.
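To illustrate the failure mode, here is a minimal sketch of a parent walk that stops when the chain runs out instead of dereferencing NULL. The types node_t and node_type_t and the function find_ancestor are hypothetical stand-ins invented for this example; they are not the hwloc or LIKWID definitions, and this is not a proposed patch, only the general guard pattern.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the hwloc object type and object;
 * these are NOT the real hwloc definitions. */
typedef enum { NODE_MACHINE, NODE_SOCKET, NODE_PCI_DEVICE } node_type_t;

typedef struct node {
    node_type_t type;
    struct node *parent;   /* NULL at the root of the tree */
} node_t;

/* Walk up the parent chain until a node of the wanted type is found.
 * Returns NULL when the chain ends first, so the caller can handle
 * "no enclosing socket" instead of crashing. */
static node_t *find_ancestor(node_t *start, node_type_t wanted)
{
    node_t *walk = start;
    while (walk != NULL && walk->type != wanted) {
        walk = walk->parent;
    }
    return walk;
}
```

With a check like this, a device that has no socket ancestor yields NULL, and the caller could skip that device or report an error rather than hit a segmentation fault.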
To Reproduce
Configure LIKWID for direct access, then run ./likwid-perfctr -g BRANCH sleep 1
(the crash reproduces with other event sets as well).
LIKWID version: 5.2.0 (downloaded from http://ftp.fau.de/pub/likwid/)
OS: CentOS Linux 7 (kernel 3.10.0)
Additional context
Everything worked fine when using the access daemon. Everything also worked fine when using direct access with an older LIKWID version (4.3.4).