Dear Intel VTune Support Team,
I am learning to use vtune_profiler_2020.0.0.605129 on Arch Linux (kernel 5.3.13), and the CPU-based analyses work on my machine.
However, I have not managed to run per-program GPU analyses (system-wide GPU profiling seems to work).
For example, when I issue the following command to profile the program glxspheres64:
TPSS_DEBUG=1 /opt/intel/vtune_profiler_2020.0.0.605129/bin64/vtune -collect graphics-rendering -app-working-dir /usr/bin -- /usr/bin/env MESA_GLSL_CACHE_DISABLE=true /usr/bin/glxspheres64
I get the following output:
log4cplus:ERROR Unable to open file: ./tpss-2020.03.14-10h16m40s.405792.log
vtune: Warning: The option to analyze all processes running on the system is enabled for this analysis type by default.
vtune: Warning: Ftrace 'igfx-preempt' events cannot be collected on this platform.
vtune: Warning: To enable hardware event-base sampling, VTune Profiler has disabled the NMI watchdog timer. The watchdog timer will be re-enabled after collection completes.
vtune: Collection started. To stop the collection, either press CTRL-C or enter from another console window: vtune -r /home/christianl/intel/amplxe/projects/test/r006gr -command stop.
strace: Process 405798 attached
strace: Process 405798 detached
strace: Process 405798 attached
Polygons in scene: 62464 (61 spheres * 1024 polys/spheres)
vcs/tpss2/tpss/src/tpss/runtime/linux/exe/tpss_deepbind.c:237 tpss_deepbind_notify_on_pthread_loaded: Assertion '((tpss_pthread_key_create_call_t)(((((tpss_probe_t*)g_tpss_probes_table) + g_tpss_pt_id[(tpss_pi_pthread_key_create)]))->trampoline)) != ((void *)0) && ((tpss_pthread_setspecific_call_t)(((((tpss_probe_t*)g_tpss_probes_table) + g_tpss_pt_id[(tpss_pi_pthread_setspecific)]))->trampoline)) != ((void *)0) && ((tpss_pthread_getspecific_call_t)(((((tpss_probe_t*)g_tpss_probes_table) + g_tpss_pt_id[(tpss_pi_pthread_getspecific)]))->trampoline)) != ((void *)0) && ((tpss_pthread_self_call_t)(((((tpss_probe_t*)g_tpss_probes_table) + g_tpss_pt_id[(tpss_pi_pthread_self)]))->trampoline)) != ((void *)0) && ((tpss_pthread_getattr_np_call_t)(((((tpss_probe_t*)g_tpss_probes_table) + g_tpss_pt_id[(tpss_pi_pthread_getattr_np)]))->trampoline)) != ((void *)0) && ((tpss_pthread_attr_getstack_call_t)(((((tpss_probe_t*)g_tpss_probes_table) + g_tpss_pt_id[(tpss_pi_pthread_attr_getstack)]))->trampoline)) != ((void *)0) && ((tpss_pthread_attr_getstacksize_call_t)(((((tpss_probe_t*)g_tpss_probes_table) + g_tpss_pt_id[(tpss_pi_pthread_attr_getstacksize)]))->trampoline)) != ((void *)0) && ((tpss_pthread_attr_setstack_call_t)(((((tpss_probe_t*)g_tpss_probes_table) + g_tpss_pt_id[(tpss_pi_pthread_attr_setstack)]))->trampoline)) != ((void *)0) && ((tpss_pthread_attr_setstacksize_call_t)(((((tpss_probe_t*)g_tpss_probes_table) + g_tpss_pt_id[(tpss_pi_pthread_attr_setstacksize)]))->trampoline)) != ((void *)0) && ((tpss_pthread_attr_destroy_call_t)(((((tpss_probe_t*)g_tpss_probes_table) + g_tpss_pt_id[(tpss_pi_pthread_attr_destroy)]))->trampoline)) != ((void *)0) && ((tpss__pthread_cleanup_push_call_t)(((((tpss_probe_t*)g_tpss_probes_table) + g_tpss_pt_id[(tpss_pi__pthread_cleanup_push)]))->trampoline)) != ((void *)0) && ((tpss__pthread_cleanup_pop_call_t)(((((tpss_probe_t*)g_tpss_probes_table) + g_tpss_pt_id[(tpss_pi__pthread_cleanup_pop)]))->trampoline)) != ((void *)0)' failed.
strace: Process 405798 detached
vtune: Collection stopped.
... (output continues) ...
I get similar tpss_deepbind_notify_on_pthread_loaded assertions for other rendering applications.
I am using the Mesa graphics driver:
OpenGL vendor string: Intel Open Source Technology Center
OpenGL renderer string: Mesa DRI Intel(R) UHD Graphics 620 (Kabylake GT2)
Do you have any advice on how to resolve this issue?
I have attached the resulting analysis file for you.
The same error results without TPSS_DEBUG=1 and without MESA_GLSL_CACHE_DISABLE=true.
Regards,
Christian