Channel: Intel® VTune™ Profiler (Intel® VTune™ Amplifier)

Hardware events at each sample @1ms

I have been looking for a way to get hardware events for each sample at a sampling period of 1ms.

I started looking into Intel VTune because it allows sampling hardware events at 1 ms. However, I haven't been able to get the hardware event counts for each sample. Intel VTune provides the accumulated hardware event counts for an application, but it would be really helpful if I could find a way to get the hardware event counts at every sample (every 1 ms).

The following posts are similar to what I am looking for:

https://software.intel.com/en-us/forums/intel-vtune-amplifier-xe/topic/7...

https://software.intel.com/en-us/forums/intel-vtune-amplifier-xe/topic/5...

Any help will be highly appreciated. If you can provide command line commands to get this data it would be really helpful.

I am really stuck at the moment because this data is unavailable, and I am looking forward to getting help from you. What I need is a timeline or trace of the actual hardware event counts. Also, please let me know if this cannot be done in Intel VTune and I should look into other tools or utilities instead.
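To illustrate the kind of per-sample trace being asked for: if a tool only reports cumulative event counts at successive timestamps, the count for each interval is the difference between consecutive readings. The sketch below shows that arithmetic in Python. The readings are made-up values, the `per_interval_counts` helper is hypothetical, and nothing here is VTune output or a VTune API.

```python
from typing import Dict, List, Tuple

# Hypothetical data layout: (timestamp in ms, {event name: cumulative count}).
Reading = Tuple[float, Dict[str, int]]


def per_interval_counts(readings: List[Reading]) -> List[Reading]:
    """Turn cumulative counter readings into per-interval deltas."""
    deltas = []
    for (_, prev), (t_cur, cur) in zip(readings, readings[1:]):
        # Count for this interval = current cumulative value minus previous one.
        delta = {name: cur[name] - prev.get(name, 0) for name in cur}
        deltas.append((t_cur, delta))
    return deltas


# Made-up cumulative readings taken 1 ms apart (illustrative numbers only).
readings = [
    (0.0, {"INST_RETIRED.ANY": 0, "CPU_CLK_UNHALTED.CORE": 0}),
    (1.0, {"INST_RETIRED.ANY": 1200, "CPU_CLK_UNHALTED.CORE": 2000}),
    (2.0, {"INST_RETIRED.ANY": 3100, "CPU_CLK_UNHALTED.CORE": 4100}),
]

trace = per_interval_counts(readings)
for timestamp_ms, counts in trace:
    print(timestamp_ms, counts)
```

This only covers the post-processing step; whether a given collector can export timestamped readings at 1 ms granularity is exactly the open question in this post.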

Thanks!


unexpected NMI received -- with or without SEP

We have KNL and SKX systems running CentOS kernel 3.10.0-693.17.1.

The KNL systems are currently running the Intel sep4_1 driver that came with VTune amplifier 2018.0.2 build 525261, while the SKX systems are running with the "perf events" driver.

In both cases, attempting to use the "-collect memory-access" option to amplxe-cl results in repeated kernel emergency messages along the lines of:

Uhhuh. NMI received for unknown reason xx on CPU yy.

Do you have a strange power-saving mode enabled?

Dazed and confused, but trying to continue

On the KNL systems the "unknown reason" alternates between 29 and 39, and the message typically shows up for all cores.   On SKX systems the "unknown reason" typically alternates between 20 and 30, and the message also typically shows up for all cores.
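One way to summarize which reason codes and CPUs appear in such a log is to tally the "unknown reason" lines. The Python sketch below assumes the message format quoted above. The sample lines are made up for illustration, and the `tally_nmi` helper is hypothetical rather than part of any Intel or kernel tooling.

```python
import re
from collections import Counter

# Matches the kernel message quoted above; the reason code is kept as text.
NMI_RE = re.compile(r"NMI received for unknown reason ([0-9a-fA-F]+) on CPU (\d+)")


def tally_nmi(lines):
    """Count occurrences of each reason code and collect the CPUs reporting them."""
    reasons = Counter()
    cpus = set()
    for line in lines:
        match = NMI_RE.search(line)
        if match:
            reasons[match.group(1)] += 1
            cpus.add(int(match.group(2)))
    return reasons, cpus


# Made-up sample lines in the format shown above.
sample = [
    "Uhhuh. NMI received for unknown reason 29 on CPU 0.",
    "Do you have a strange power-saving mode enabled?",
    "Dazed and confused, but trying to continue",
    "Uhhuh. NMI received for unknown reason 39 on CPU 1.",
    "Uhhuh. NMI received for unknown reason 29 on CPU 1.",
]

reasons, cpus = tally_nmi(sample)
print(reasons.most_common(), sorted(cpus))
```

In practice the input lines would come from the kernel log (for example, saved `dmesg` output) rather than a hard-coded list.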

The nodes don't crash -- indeed, the amplxe-cl job finishes and prints out its summary report.  BUT, these messages printed by "pr_emerg()" are echoed to all root windows on the master node, where they make the system operators cranky.  Cranky operators often kill the offending jobs.

On the SKX nodes, about the time the unexpected NMIs start, we see a handful of messages like:

INFO: NMI handler (perf_event_nmi_handler) took too long to run: 585758.001 msec

and sometimes:

hrtimer: interrupt took 25076958 ns

The perf_event_nmi_handler message seems weird -- 585758 msec is almost 10 minutes, and this message appeared within 3 minutes of the start of the job.   The hrtimer number (about 25 milliseconds) is more plausible, but no less concerning.
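A quick unit check on the two figures quoted in the kernel messages above, written as a small Python sketch. The variable names are illustrative; the values are copied from the messages.

```python
# Values copied from the two kernel messages quoted above.
nmi_handler_ms = 585758.001  # "took too long to run: 585758.001 msec"
hrtimer_ns = 25076958        # "interrupt took 25076958 ns"

# Convert milliseconds to minutes, and nanoseconds to milliseconds.
nmi_handler_minutes = nmi_handler_ms / 1000 / 60
hrtimer_ms = hrtimer_ns / 1e6

print(f"NMI handler: {nmi_handler_minutes:.2f} min")  # 9.76 min
print(f"hrtimer:     {hrtimer_ms:.2f} ms")            # 25.08 ms
```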

On the KNL nodes (running sep), there are no other interesting messages in the log -- just repetitions of the trio of "Dazed and confused" messages for the duration of the job.   The log that I am staring at now repeats this trio of messages 1722 times during the 18 minutes that VTune was running, then everything appears to have returned to normal.

As a short-term workaround, I have found that collecting uncore counters "manually" using "-collect-with runsa -knob event-config=..." does data collection without generating irritating kernel messages, but I have not looked in detail at the collected data.

In the slightly longer term, we plan to install and test Intel Parallel Studio 2018 update 2 along with the corresponding SEP kernel module.   Does anyone know if this is likely to provide any benefit with regard to this class of problems?

 

[Solved][yocto Linux] [Remote target] Error: Amplifier cannot detect remote machine configuration

Hi,

I am trying to set up VTune in order to report some GPU metrics on a remote target running a custom Linux distro built with Yocto.

I appended the VTune recipe to the distro image to embed the VTune drivers on the board. The installation looks OK:

 

root@me:~# lsmod
Module                  Size  Used by
vtsspp                352256  0
sep4_1                774144  0
socperf2_0             32768  1 sep4_1
pax                    16384  0
intel_rapl             20480  0
pwm_lpss_pci           16384  0
x86_pkg_temp_thermal    16384  0
pwm_lpss               16384  1 pwm_lpss_pci
igb                   172032  0
coretemp               16384  0
spi_pxa2xx_platform    24576  0
i915                 1351680  0
mei_me                 28672  0
mei                    61440  1 mei_me
uio                    16384  0

root@me:/opt/intel# ll vtune_amplifier_2018.2.0.551022*
vtune_amplifier_2018.2.0.551022:
total 44
drwxr-xr-x 2 root root 4096 May 15 03:58 bin32
drwxr-xr-x 2 root root 4096 May 15 03:58 bin64
drwxr-xr-x 4 root root 4096 May 15 03:58 config
drwxr-xr-x 3 root root 4096 May 15 03:58 documentation
drwxr-xr-x 5 root root 4096 May 15 03:58 lib32
drwxr-xr-x 5 root root 4096 May 15 03:58 lib64
drwxr-xr-x 3 root root 4096 May 15 03:58 message
drwxr-xr-x 3 root root 4096 May 15 03:58 resource
-rwxr-xr-x 1 root root 1899 Mar 14 13:00 sep_vars.sh
-rwxr-xr-x 1 root root 2059 Mar 14 13:00 sep_vars_busybox.sh
drwxr-xr-x 5 root root 4096 May 15 03:58 sepdk

vtune_amplifier_2018.2.0.551022_drivers:
total 324
-rwxr-xr-x 1 root root  20505 May 14 12:19 boot-script
-rwxr-xr-x 1 root root  24136 May 14 12:19 insmod-sep
drwxr-xr-x 2 root root   4096 May 14 12:19 pax
-rwxr-xr-x 1 root root  10041 May 14 12:19 rmmod-sep
-rw-r--r-- 1 root root 255848 May 14 12:19 sep4_1-x32_64-4.14.33-intel-pk-standardsmp.ko
drwxr-xr-x 3 root root   4096 May 14 12:19 socperf
drwxr-xr-x 2 root root   4096 May 14 12:19 vtsspp

 

But VTune fails to run with the following error (the same issue occurs when launched from the IDE):

[ bin64]$ ./amplxe-cl -v -target-system=ssh:user@X.Y.0.5 -collect gpu-hotspots -target-pid 252
amplxe: Using target: ssh:user@X.Y.0.5
amplxe: Cannot find product on the device. Enabling automatic installation...
amplxe: Installing the package to user@X.Y.0.5
amplxe: Error: Could not copy /opt/intel/system_studio_2018/vtune_amplifier_2018.2.0.551022/target/linux/vtune_amplifier_target_x86.tgz to /opt/intel/vtune_amplifier_2018.2.0.551022/ on the target.
Make sure VTune Amplifier installation directory on the remote system option in the Analysis Target tab is set to the correct writable path.
Alternatively, you may use the --target-install-dir option to specify the correct path from command line.
amplxe: Error: Amplifier cannot detect remote machine configuration.
amplxe: Error: Could not copy /opt/intel/system_studio_2018/vtune_amplifier_2018.2.0.551022/target/linux/vtune_amplifier_target_x86.tgz to /opt/intel/vtune_amplifier_2018.2.0.551022/ on the target.
Make sure VTune Amplifier installation directory on the remote system option in the Analysis Target tab is set to the correct writable path.
Alternatively, you may use the --target-install-dir option to specify the correct path from command line.
amplxe: Error: Amplifier cannot detect remote machine configuration.

[bin64]$ ./amplxe-cl -v -target-system=ssh:root@X.Y.0.5 -collect gpu-hotspots -target-pid 252
amplxe: Using target: ssh:root@X.Y.0.5
amplxe: Cannot find product on the device. Enabling automatic installation...
amplxe: Installing the package to root@X.Y.0.5
amplxe: Error: Amplifier cannot detect remote machine configuration.
amplxe: Error: Amplifier cannot detect remote machine configuration.

 

On the first run, the command tries to install the drivers in the same place where they already are, and fails because the user is not allowed to write to that location. On the second run, as root, it only complains that it cannot detect the remote machine configuration.

What exactly is VTune looking for? How can I make it run correctly on my remote device?

[Yocto linux] [Media sdk] gpu profiling

Hi there.

The Intel Media SDK is installed on the target and works well. I'm mainly using the `sample_encode` tool. But the Intel driver isn't recognized by VTune.

I followed the steps from https://software.intel.com/en-us/vtune-amplifier-help-2019-beta-intel-media-sdk-program-analysis#MFX and the requested paths (INTEL_LIBITTNOTIFY64=/opt/intel/vtune_amplifier_xe/lib64/runtime/libittnotify_collector.so) are exported.

/opt/intel/system_studio_2018/vtune_amplifier/bin64/amplxe-cl -v -target-system=ssh:root@X.Y.0.5 -collect gpu-hotspots datatest/test.sh
amplxe: Using target: ssh:root@X.Y.0.5
amplxe: Error: Intel Graphics kernel module is not detected.
amplxe: Error: Cannot collect GPU hardware metrics. Make sure the Intel OpenCL SDK or Intel Media SDK is installed.

The Media SDK and HD driver were compiled from the open-source GitHub repo.

How can I fulfill the VTune requirements for the HD GPU?

libva output:

 vainfo
error: can't connect to X server!
libva info: VA-API version 1.1.0
libva info: va_getDriverName() returns 0
libva info: User requested driver 'iHD'
libva info: Trying to open /usr/lib/dri/iHD_drv_video.so
libva info: Found init function __vaDriverInit_1_1
libva info: va_openDriver() returns 0
vainfo: VA-API version: 1.1 (libva 2.1.1.pre1)
vainfo: Driver version: Intel iHD driver - 2.0.0
vainfo: Supported profile and entrypoints
      VAProfileNone                   : VAEntrypointVideoProc
      VAProfileNone                   : VAEntrypointStats
      VAProfileMPEG2Simple            : VAEntrypointVLD
      VAProfileMPEG2Main              : VAEntrypointVLD
      VAProfileH264Main               : VAEntrypointVLD
      VAProfileH264Main               : VAEntrypointEncSlice
      VAProfileH264Main               : VAEntrypointFEI
      VAProfileH264Main               : VAEntrypointEncSliceLP
      VAProfileH264High               : VAEntrypointVLD
      VAProfileH264High               : VAEntrypointEncSlice
      VAProfileH264High               : VAEntrypointFEI
      VAProfileH264High               : VAEntrypointEncSliceLP
      VAProfileVC1Simple              : VAEntrypointVLD
      VAProfileVC1Main                : VAEntrypointVLD
      VAProfileVC1Advanced            : VAEntrypointVLD
      VAProfileJPEGBaseline           : VAEntrypointVLD
      VAProfileJPEGBaseline           : VAEntrypointEncPicture
      VAProfileH264ConstrainedBaseline: VAEntrypointVLD
      VAProfileH264ConstrainedBaseline: VAEntrypointEncSlice
      VAProfileH264ConstrainedBaseline: VAEntrypointFEI
      VAProfileH264ConstrainedBaseline: VAEntrypointEncSliceLP
      VAProfileVP8Version0_3          : VAEntrypointVLD
      VAProfileHEVCMain               : VAEntrypointVLD
      VAProfileHEVCMain               : VAEntrypointEncSlice
      VAProfileHEVCMain               : VAEntrypointFEI
      VAProfileHEVCMain10             : VAEntrypointVLD
      VAProfileVP9Profile0            : VAEntrypointVLD

 

 

Problem with compiling SEP kernel driver on CentOS 7

Compilation of SEP kernel driver on CentOS 7 (kernel 3.10.0-862.2.3.el7.x86_64) terminates with the errors below. Is there any way to install the driver for this kernel version?

$ ./build-driver

Options in brackets "[ ... ]" indicate default values
that will be used when only the ENTER key is pressed.

C compiler to use: [ /bin/gcc ]

Make command to use: [ /bin/make ]

Kernel source directory: [ /lib/modules/3.10.0-862.2.3.el7.x86_64/build ]
rm -f *.o .*.o.cmd .*.o.d .*.ko.cmd .*.ko.unsigned.cmd *.gcno
rm -f sep4_1.ko sep4_1.ko.unsigned
rm -f sep4_1*x32_64-3.10.0-862.2.3.el7.x86_64smp.ko
rm -f Module.symvers Modules.symvers *.mod.c modules.order Module.markers
rm -rf .tmp_versions
make[1]: Entering directory `/opt/intel/vtune_amplifier_2018.2.0.551022/sepdk/src/pax'
rm -f *.o .*.o.cmd .*.o.d .*.ko.cmd .*.ko.unsigned.cmd *.gcno
rm -f pax.ko pax.ko.unsigned pax-x32_64-3.10.0-862.2.3.el7.x86_64smp.ko
rm -f Module.symvers Modules.symvers *.mod.c modules.order Module.markers
rm -rf .tmp_versions
make[1]: Leaving directory `/opt/intel/vtune_amplifier_2018.2.0.551022/sepdk/src/pax'
make[1]: Entering directory `/opt/intel/vtune_amplifier_2018.2.0.551022/sepdk/src/socperf/src'
rm -f *.o .*.o.cmd .*.o.d .*.ko.cmd .*.ko.unsigned.cmd *.gcno
rm -f socperf2_0.ko socperf2_0.ko.unsigned
rm -f socperf2_0*x32_64-3.10.0-862.2.3.el7.x86_64smp.ko
rm -f Module.symvers Modules.symvers *.mod.c modules.order Module.markers
rm -rf .tmp_versions
make[1]: Leaving directory `/opt/intel/vtune_amplifier_2018.2.0.551022/sepdk/src/socperf/src'
make[1]: Entering directory `/opt/intel/vtune_amplifier_2018.2.0.551022/sepdk/src/vtsspp'
rm -f *.o .*.o.cmd .*.o.d .*.ko.cmd .*.ko.unsigned.cmd *.gcno
rm -f Module.symvers Modules.symvers *.mod.c modules.order Module.markers
rm -rf .tmp_versions
rm -f /opt/intel/vtune_amplifier_2018.2.0.551022/sepdk/src/vtsspp/vtss_autoconf.h
rm -f vtsspp.ko vtsspp.ko.unsigned vtsspp-x32_64-3.10.0-862.2.3.el7.x86_64smp.ko
make[1]: Leaving directory `/opt/intel/vtune_amplifier_2018.2.0.551022/sepdk/src/vtsspp'
make[1]: Entering directory `/opt/intel/vtune_amplifier_2018.2.0.551022/sepdk/src/socperf/src'
/bin/make -C /lib/modules/3.10.0-862.2.3.el7.x86_64/build M=/opt/intel/vtune_amplifier_2018.2.0.551022/sepdk/src/socperf/src LDDINCDIR=/opt/intel/vtune_amplifier_2018.2.0.551022/sepdk/src/socperf/src/../include LDDINCDIR1=/opt/intel/vtune_amplifier_2018.2.0.551022/sepdk/src/socperf/src/inc modules PWD=/opt/intel/vtune_amplifier_2018.2.0.551022/sepdk/src/socperf/src -j4
make[2]: Entering directory `/usr/src/kernels/3.10.0-862.2.3.el7.x86_64'
  CC [M]  /opt/intel/vtune_amplifier_2018.2.0.551022/sepdk/src/socperf/src/socperfdrv.o
  CC [M]  /opt/intel/vtune_amplifier_2018.2.0.551022/sepdk/src/socperf/src/control.o
  CC [M]  /opt/intel/vtune_amplifier_2018.2.0.551022/sepdk/src/socperf/src/utility.o
  CC [M]  /opt/intel/vtune_amplifier_2018.2.0.551022/sepdk/src/socperf/src/pci.o
  CC [M]  /opt/intel/vtune_amplifier_2018.2.0.551022/sepdk/src/socperf/src/soc_uncore.o
  CC [M]  /opt/intel/vtune_amplifier_2018.2.0.551022/sepdk/src/socperf/src/haswellunc_sa.o
  CC [M]  /opt/intel/vtune_amplifier_2018.2.0.551022/sepdk/src/socperf/src/npk_uncore.o
  LD [M]  /opt/intel/vtune_amplifier_2018.2.0.551022/sepdk/src/socperf/src/socperf2_0.o
  Building modules, stage 2.
  MODPOST 1 modules
  CC      /opt/intel/vtune_amplifier_2018.2.0.551022/sepdk/src/socperf/src/socperf2_0.mod.o
  LD [M]  /opt/intel/vtune_amplifier_2018.2.0.551022/sepdk/src/socperf/src/socperf2_0.ko
make[2]: Leaving directory `/usr/src/kernels/3.10.0-862.2.3.el7.x86_64'
cp socperf2_0.ko socperf2_0-x32_64-3.10.0-862.2.3.el7.x86_64smp.ko
make[1]: Leaving directory `/opt/intel/vtune_amplifier_2018.2.0.551022/sepdk/src/socperf/src'
/bin/make -C /lib/modules/3.10.0-862.2.3.el7.x86_64/build M=/opt/intel/vtune_amplifier_2018.2.0.551022/sepdk/src LDDINCDIR=/opt/intel/vtune_amplifier_2018.2.0.551022/sepdk/src/../include LDDINCDIR1=/opt/intel/vtune_amplifier_2018.2.0.551022/sepdk/src/inc modules PWD=/opt/intel/vtune_amplifier_2018.2.0.551022/sepdk/src
make[1]: Entering directory `/usr/src/kernels/3.10.0-862.2.3.el7.x86_64'
  CC [M]  /opt/intel/vtune_amplifier_2018.2.0.551022/sepdk/src/lwpmudrv.o
  CC [M]  /opt/intel/vtune_amplifier_2018.2.0.551022/sepdk/src/control.o
  CC [M]  /opt/intel/vtune_amplifier_2018.2.0.551022/sepdk/src/cpumon.o
  CC [M]  /opt/intel/vtune_amplifier_2018.2.0.551022/sepdk/src/eventmux.o
  CC [M]  /opt/intel/vtune_amplifier_2018.2.0.551022/sepdk/src/linuxos.o
  CC [M]  /opt/intel/vtune_amplifier_2018.2.0.551022/sepdk/src/output.o
  CC [M]  /opt/intel/vtune_amplifier_2018.2.0.551022/sepdk/src/pmi.o
  CC [M]  /opt/intel/vtune_amplifier_2018.2.0.551022/sepdk/src/sys_info.o
  CC [M]  /opt/intel/vtune_amplifier_2018.2.0.551022/sepdk/src/utility.o
  CC [M]  /opt/intel/vtune_amplifier_2018.2.0.551022/sepdk/src/valleyview_sochap.o
  CC [M]  /opt/intel/vtune_amplifier_2018.2.0.551022/sepdk/src/unc_power.o
  CC [M]  /opt/intel/vtune_amplifier_2018.2.0.551022/sepdk/src/pci.o
  CC [M]  /opt/intel/vtune_amplifier_2018.2.0.551022/sepdk/src/chap.o
  CC [M]  /opt/intel/vtune_amplifier_2018.2.0.551022/sepdk/src/gmch.o
  CC [M]  /opt/intel/vtune_amplifier_2018.2.0.551022/sepdk/src/gfx.o
  CC [M]  /opt/intel/vtune_amplifier_2018.2.0.551022/sepdk/src/unc_sa.o
  CC [M]  /opt/intel/vtune_amplifier_2018.2.0.551022/sepdk/src/core2.o
  CC [M]  /opt/intel/vtune_amplifier_2018.2.0.551022/sepdk/src/perfver4.o
  AS [M]  /opt/intel/vtune_amplifier_2018.2.0.551022/sepdk/src/sys64.o
/opt/intel/vtune_amplifier_2018.2.0.551022/sepdk/src/sys64.o: warning: objtool: .text+0x3: return instruction outside of a callable function
/opt/intel/vtune_amplifier_2018.2.0.551022/sepdk/src/sys64.o: warning: objtool: .text+0x7: return instruction outside of a callable function
/opt/intel/vtune_amplifier_2018.2.0.551022/sepdk/src/sys64.o: warning: objtool: .text+0x8: return instruction outside of a callable function
  CC [M]  /opt/intel/vtune_amplifier_2018.2.0.551022/sepdk/src/silvermont.o
  CC [M]  /opt/intel/vtune_amplifier_2018.2.0.551022/sepdk/src/apic.o
  CC [M]  /opt/intel/vtune_amplifier_2018.2.0.551022/sepdk/src/pebs.o
  CC [M]  /opt/intel/vtune_amplifier_2018.2.0.551022/sepdk/src/unc_gt.o
  CC [M]  /opt/intel/vtune_amplifier_2018.2.0.551022/sepdk/src/unc_mmio.o
  CC [M]  /opt/intel/vtune_amplifier_2018.2.0.551022/sepdk/src/unc_msr.o
  CC [M]  /opt/intel/vtune_amplifier_2018.2.0.551022/sepdk/src/unc_common.o
  CC [M]  /opt/intel/vtune_amplifier_2018.2.0.551022/sepdk/src/unc_pci.o
  CC [M]  /opt/intel/vtune_amplifier_2018.2.0.551022/sepdk/src/sepdrv_p_state.o
  LD [M]  /opt/intel/vtune_amplifier_2018.2.0.551022/sepdk/src/sep4_1.o
  Building modules, stage 2.
  MODPOST 1 modules
  CC      /opt/intel/vtune_amplifier_2018.2.0.551022/sepdk/src/sep4_1.mod.o
  LD [M]  /opt/intel/vtune_amplifier_2018.2.0.551022/sepdk/src/sep4_1.ko
make[1]: Leaving directory `/usr/src/kernels/3.10.0-862.2.3.el7.x86_64'
cp sep4_1.ko sep4_1-x32_64-3.10.0-862.2.3.el7.x86_64smp.ko
make[1]: Entering directory `/opt/intel/vtune_amplifier_2018.2.0.551022/sepdk/src/pax'
/bin/make -C /lib/modules/3.10.0-862.2.3.el7.x86_64/build M=/opt/intel/vtune_amplifier_2018.2.0.551022/sepdk/src/pax LDDINCDIR=/opt/intel/vtune_amplifier_2018.2.0.551022/sepdk/src/pax/../../include LDDINCDIR1=/opt/intel/vtune_amplifier_2018.2.0.551022/sepdk/src/pax/../inc modules PWD=/opt/intel/vtune_amplifier_2018.2.0.551022/sepdk/src/pax
make[2]: Entering directory `/usr/src/kernels/3.10.0-862.2.3.el7.x86_64'
  CC [M]  /opt/intel/vtune_amplifier_2018.2.0.551022/sepdk/src/pax/pax.o
  Building modules, stage 2.
  MODPOST 1 modules
  CC      /opt/intel/vtune_amplifier_2018.2.0.551022/sepdk/src/pax/pax.mod.o
  LD [M]  /opt/intel/vtune_amplifier_2018.2.0.551022/sepdk/src/pax/pax.ko
make[2]: Leaving directory `/usr/src/kernels/3.10.0-862.2.3.el7.x86_64'
cp pax.ko pax-x32_64-3.10.0-862.2.3.el7.x86_64smp.ko
make[1]: Leaving directory `/opt/intel/vtune_amplifier_2018.2.0.551022/sepdk/src/pax'
make[1]: Entering directory `/opt/intel/vtune_amplifier_2018.2.0.551022/sepdk/src/vtsspp'
make[2]: Entering directory `/usr/src/kernels/3.10.0-862.2.3.el7.x86_64'
  CC [M]  /opt/intel/vtune_amplifier_2018.2.0.551022/sepdk/src/vtsspp/module.o
  CC [M]  /opt/intel/vtune_amplifier_2018.2.0.551022/sepdk/src/vtsspp/collector.o
  CC [M]  /opt/intel/vtune_amplifier_2018.2.0.551022/sepdk/src/vtsspp/procfs.o
  CC [M]  /opt/intel/vtune_amplifier_2018.2.0.551022/sepdk/src/vtsspp/transport.o
  CC [M]  /opt/intel/vtune_amplifier_2018.2.0.551022/sepdk/src/vtsspp/record.o
  CC [M]  /opt/intel/vtune_amplifier_2018.2.0.551022/sepdk/src/vtsspp/task_map.o
  CC [M]  /opt/intel/vtune_amplifier_2018.2.0.551022/sepdk/src/vtsspp/globals.o
  CC [M]  /opt/intel/vtune_amplifier_2018.2.0.551022/sepdk/src/vtsspp/cpuevents.o
  CC [M]  /opt/intel/vtune_amplifier_2018.2.0.551022/sepdk/src/vtsspp/user_vm.o
  CC [M]  /opt/intel/vtune_amplifier_2018.2.0.551022/sepdk/src/vtsspp/stack.o
/opt/intel/vtune_amplifier_2018.2.0.551022/sepdk/src/vtsspp/stack.c:299:21: error: variable ‘vtss_stack_ops’ has initializer but incomplete type
 static const struct stacktrace_ops vtss_stack_ops = {
                     ^
/opt/intel/vtune_amplifier_2018.2.0.551022/sepdk/src/vtsspp/stack.c:304:5: error: unknown field ‘stack’ specified in initializer
     .stack          = vtss_stack_stack,
     ^
/opt/intel/vtune_amplifier_2018.2.0.551022/sepdk/src/vtsspp/stack.c:304:5: error: excess elements in struct initializer [-Werror]
/opt/intel/vtune_amplifier_2018.2.0.551022/sepdk/src/vtsspp/stack.c:304:5: error: (near initialization for ‘vtss_stack_ops’) [-Werror]
/opt/intel/vtune_amplifier_2018.2.0.551022/sepdk/src/vtsspp/stack.c:306:5: error: unknown field ‘address’ specified in initializer
     .address        = vtss_stack_address,
     ^
/opt/intel/vtune_amplifier_2018.2.0.551022/sepdk/src/vtsspp/stack.c:306:5: error: excess elements in struct initializer [-Werror]
/opt/intel/vtune_amplifier_2018.2.0.551022/sepdk/src/vtsspp/stack.c:306:5: error: (near initialization for ‘vtss_stack_ops’) [-Werror]
/opt/intel/vtune_amplifier_2018.2.0.551022/sepdk/src/vtsspp/stack.c: In function ‘vtss_stack_unwind_kernel’:
/opt/intel/vtune_amplifier_2018.2.0.551022/sepdk/src/vtsspp/stack.c:329:5: error: implicit declaration of function ‘dump_trace’ [-Werror=implicit-function-declaration]
     dump_trace(task, regs_in, NULL, &vtss_stack_ops, &k_stk);
     ^
cc1: all warnings being treated as errors
make[3]: *** [/opt/intel/vtune_amplifier_2018.2.0.551022/sepdk/src/vtsspp/stack.o] Error 1
make[2]: *** [_module_/opt/intel/vtune_amplifier_2018.2.0.551022/sepdk/src/vtsspp] Error 2
make[2]: Leaving directory `/usr/src/kernels/3.10.0-862.2.3.el7.x86_64'
make[1]: [all] Error 2 (ignored)
cp -f vtsspp.ko vtsspp-x32_64-3.10.0-862.2.3.el7.x86_64smp.ko
cp: cannot stat ‘vtsspp.ko’: No such file or directory
make[1]: [all] Error 1 (ignored)
make[1]: Leaving directory `/opt/intel/vtune_amplifier_2018.2.0.551022/sepdk/src/vtsspp'

 

[Yocto Linux] [GPU] cpugpu-concurrency remains stuck

Running a custom distro, I'm now able to run *VTune* remotely and get results.
Moreover, I am able to launch a GPU hotspots analysis, thanks again to Pavel.
But when I launch a "GPU/CPU concurrency" analysis, the process being profiled starts and stops correctly, yet the VTune program remains stuck and never ends.
See the following `ps alx` output:

1136 ?        Ss     0:00 sh -c sh -c 'echo _pvi_ 1>&2 ; /opt/intel/vtune_amplifier_2018.2.0.551022/bin$(if [ `uname -m` = x86_64 ] || [ `uname -m` = amd64 ]; then echo 64; else echo 32;fi)/amplxe-runss -V 1>&2 ;
 1146 ?        Sl     0:00 /opt/intel/vtune_amplifier_2018.2.0.551022/bin64/amplxe-runss --result-dir /tmp/amplxe-results-root/root_adse3950/tmpTTnq5j/r006cgc --option-file /tmp/root@adse3950_r006cgc.opts
 1161 ?        D      0:00 /opt/intel/vtune_amplifier_2018.2.0.551022/bin64/sep -start -experimental -uem timer=10 -out /tmp/amplxe-results-root/root_adse3950/tmpTTnq5j/r006cgc/data.0/sep7f0d5a1a4700.20180514T1028
 1182 ?        S      0:00 /opt/intel/vtune_amplifier_2018.2.0.551022/bin64/sep -stop

If I understand this correctly, `amplxe` has detected that the profiled program has ended, then sent a `sep -stop` to terminate `sep` (the collector process?)
and is waiting for it to end. But the `sep -start` process (PID 1161) is stuck in the "D" state (uninterruptible sleep).
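As a small aid for spotting this condition, the Python sketch below picks out processes in the "D" state from `ps`-style text. It assumes the column order of the listing above (PID, TTY, STAT, TIME, COMMAND). The `uninterruptible` helper is hypothetical, and the commands in the sample are shortened for the example.

```python
def uninterruptible(ps_lines):
    """Return (pid, command) for rows whose STAT field starts with 'D'.

    Assumes the columns are: PID TTY STAT TIME COMMAND.
    """
    stuck = []
    for line in ps_lines:
        fields = line.split(None, 4)
        # Skip header lines and anything that does not look like a process row.
        if len(fields) < 5 or not fields[0].isdigit():
            continue
        pid, _tty, stat, _time, command = fields
        if stat.startswith("D"):
            stuck.append((int(pid), command))
    return stuck


# Rows modeled on the listing above, with commands shortened.
sample = [
    " 1146 ?        Sl     0:00 amplxe-runss",
    " 1161 ?        D      0:00 sep -start",
    " 1182 ?        S      0:00 sep -stop",
]

stuck = uninterruptible(sample)
print(stuck)  # [(1161, 'sep -start')]
```

A process in this state is blocked inside the kernel and does not act on signals until the blocking operation completes, which is consistent with `sep -stop` waiting indefinitely.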

Here is an extract of the `ps -s` output:

UID   PID          PENDING          BLOCKED          IGNORED           CAUGHT STAT TTY        TIME COMMAND
   0  1136 0000000000000000 0000000000010000 0000000000000004 0000000000010002 Ss   ?          0:00 sh -c sh -c 'echo _pvi_ 1>&2 ; /opt/intel/vtune_amplifier_2018.2.0.551022/bin$(if [ `uname -m` = x86_64 ] || [ `uname -m` = amd64 ]; then echo 64; else echo 32;fi)/amplxe-runss -V 1>&2 ; echo /_pvi_  1>&2 ;' chmod 600 /tmp/root@adse3950_r006cgc.opts ; mkdir -p /tmp/tmpTTnq5j ; mkdir -p /tmp/amplxe-results-root/root_adse3950/tmpTTnq5j/r006cgc/log/target ; sh -c 'cd "/home/root/datatest"&& AMPLXE_LOG_DIR=/tmp/amplxe-results-root/root_adse3950/tmpTTnq5j/r006cgc/log/target /opt/intel/vtune_amplifier_2018.2.0.551022/bin$(if [ `uname -m` = x86_64 ] || [ `uname -m` = amd64 ]; then echo 64; else echo 32;fi)/amplxe-runss --result-dir /tmp/amplxe-results-root/root_adse3950/tmpTTnq5j/r006cgc --option-file /tmp/root@adse3950_r006cgc.opts'
    0  1146 0000000000000000 fffffffe7ffbfa37 0000000000000000 00000001c1004eae Sl   ?          0:00 /opt/intel/vtune_amplifier_2018.2.0.551022/bin64/amplxe-runss --result-dir /tmp/amplxe-results-root/root_adse3950/tmpTTnq5j/r006cgc --option-file /tmp/root@adse3950_r006cgc.opts
    0  1161 0000000000000000 0000000000000000 0000000000000002 0000000180000000 Dl   ?          0:00 /opt/intel/vtune_amplifier_2018.2.0.551022/bin64/sep -start -experimental -uem timer=10 -out /tmp/amplxe-results-root/root_adse3950/tmpTTnq5j/r006cgc/data.0/sep7f0d5a1a4700.20180514T102814.847811 -ec INST_RETIRED.ANY:sa=1600000,CPU_CLK_UNHALTED.CORE:sa=1600000,CPU_CLK_UNHALTED.REF_TSC:sa=1600000,UNC_SOC_All_BW, -d 0 -uem factor=10
    0  1182 0000000000000000 0000000000000000 0000000000000002 0000000180000000 S    ?          0:00 /opt/intel/vtune_amplifier_2018.2.0.551022/bin64/sep -stop




The only way out of this situation is to hard reset the board.

Any clue how to avoid this?

Application performance snapshot can't find libmps.so

I'm trying to run Application Performance Snapshot 2018 Update 2, and I get this error:

 

[sajid@xrmlite APS_2018_update2_lin_551022]$ mpirun -np 2 ./aps python /home/sajid/packages/zone_plate_testing/zp_rotation/zp_make_hdf5.py
Emon collector successfully stopped.
aps Error: python: symbol lookup error: /home/sajid/Downloads/APS_2018_update2_lin_551022/internal/lib64/libmps.so: undefined symbol: PMPI_Initialized
aps Error: Cannot run the collection.
aps Error: Failed to detect the collection configuration.

Recipe: How to detect Scheduling Overhead in Intel® Threading Building Blocks (Intel TBB) Apps

Hello VTune Users,

Let us share one more VTune Amplifier Performance Analysis Cookbook recipe, this one on threading efficiency analysis: https://software.intel.com/en-us/vtune-amplifier-cookbook-intel-tbb-scheduling-overhead. The article shows how to use the VTune Amplifier Concurrency analysis to detect parallel constructs with scheduling overhead in Intel TBB applications.

It would be great to hear your feedback on the article's usefulness and format, and on what more we can do to make threading analysis in VTune Amplifier better.

Thanks & Regards, Dmitry

 


abort() in libtpsstool.so while using vtune_2018_update2

I am using vtune_amplifier_2018_update2. It prints the following errors and then calls abort() in libtpsstool.so:
# amplxe-cl -collect hotspots -r r001hs ./a.out
vcs/collectunits1/tmu/src/tmu.c:460 write_trace: Assertion 'compressor: can't process buffer' failed.
vcs/tpss2/tpss/src/tpss/runtime/linux/exe/tpss_deepbind.c:1344 applibc___errno_location: Assertion 'is_control_service_thread_current() == 0' failed.

......

CPU: Intel(R) Xeon(R) CPU E5-2630 v2
OS: CentOS 6.2

Announcing Intel® VTune™ Amplifier’s Platform Profiler Feature!

Is your system correctly configured for your workloads? Would you benefit from more memory or I/O? Are your workloads well-optimized? Are you using memory and storage efficiently? Do you have non-uniform memory issues?

If you answer YES to any of these questions, Intel® VTune™ Amplifier’s Platform Profiler may be right for you.

It helps:

  • Software architects tune long-running workloads
  • Infrastructure architects configure systems efficiently

Historically, VTune Amplifier has provided detailed performance data collected over a few seconds or minutes—great in many scenarios, but sub-par for longer studies where seeing the “big picture” is needed.

Platform Profiler mitigates this issue. It delivers the big picture view over a longer period of time so you can see:

  • which workloads are running well
  • which need tuning
  • which would benefit from a different system configuration

 

Figure 1: Example metrics display

 

Try Platform Profiler today

You are invited to try a free technical preview release. Just follow these simple steps (if you are already registered, skip to step 2):

  1. Register for the Intel® Parallel Studio XE Beta
  2. Download and install (the Platform Profiler is a separate download from the Intel Parallel Studio XE Beta)
  3. Check out the getting started guide, then give Platform Profiler a test drive
  4. Fill out the online survey

Profiling Python code with virtualenv?

I'm using Intel VTune XE 2016 to profile a program written in Python. Here's what I did:

1. source virtualenv
2. amplxe-cl -collect hotspots <Python script> <config file path>

But I got this at the beginning of the output:

    amplxe: Error: Binary file of the analysis target does not contain symbols required for profiling. See the 'Analyzing Statically Linked Binaries' help topic for more details.
    amplxe: Error: [2018.05.21 11:11:46] /usr/sbin/ldconfig _init() instrumentation failed. Profiling data may be missing.
    amplxe: Error: Binary file of the analysis target does not contain symbols required for profiling. See the 'Analyzing Statically Linked Binaries' help topic for more details.
    amplxe: Error: [2018.05.21 11:11:46] /usr/sbin/ldconfig _init() instrumentation failed. Profiling data may be missing.

Any ideas about this? Thanks for your help.

Difference between Basic and Advanced hotspots

Hi,

I am seeing a difference between the Basic and Advanced Hotspots results, and I cannot explain it.
I would expect similar results, since the same code and the same data were used for both.
Please see the attached picture.
Thanks for your help,

Peter

Attachment: VTune discrepancy.png (image/png, 224.98 KB)

Stuck Finalizing

Hi, twice I successfully captured a Basic Hotspots result, but collection has now started to get stuck in the Finalizing stage. Following a previous thread, I ran the collection from the command line and took a dump. I am using the latest version of VTune; it updated today, in fact. From the GUI I am unable to cancel the finalization and have had to forcibly terminate the VTune applications in Task Manager.

Recipe: How to detect an overhead on memory accesses for a PMDK-based application

Hello VTune Users,

Let us share one more VTune Amplifier Performance Analysis Cookbook recipe, this one on Memory Access analysis: https://software.intel.com/en-us/vtune-amplifier-cookbook-pmdk-applicati.... The article shows how to use the VTune Amplifier Memory Access analysis to detect overhead caused by PMDK library calls in an application that uses persistent memory.

It would be great to hear your feedback on the article's usefulness and format, and on what more we can do to make memory analysis in VTune Amplifier better.

Best Regards,

Dmitry

Missing Release Notes - Intel VTune Amplifier 2018 Update 3 for Windows

Hi,

I got several notifications about *VTune Amplifier 2018 Update 3* being available, but the linked release notes do not mention it at all.
Could we please get some information about it?

Thank you,
Dietmar
 


Problem Loading SEP Module (Sampling Drivers) CentOS Remote Capture

 

Version: Linux version 3.10.0-693.21.1.el7.x86_64 (builder@kbuilder.dev.centos.org) (gcc version 4.8.5 20150623 (Red Hat 4.8.5-16) (GCC) ) #1 SMP Wed Mar 7 19:03:37 UTC 2018

I am able to successfully compile the drivers using ./build-drivers and pointing it at /usr/src/kernels/3.10.0-693.21.1.el7.x86_64

However, when I go to install the modules I get the following output:

./insmod-sep -r

Warning:  the following driver(s) were not found loaded in the kernel:  sep4_1.

Warning:  no vtsspp driver was found loaded in the kernel.
Removing socperf2_0 driver from the kernel ... done.
Deleting /dev/socperf2_0 devices ... done.
The socperf2_0 driver has been successfully unloaded.
Attempting to stop PAX service ...
Removing pax driver from the kernel ... done.
Deleting previously created /dev/pax device ... done.
The pax driver has been successfully unloaded.
PAX service has been stopped.
Checking for PMU arbitration service (PAX) ... not detected.
Attempting to start PAX service ...
Executing: insmod ./pax/pax-x32_64-3.10.0-693.21.1.el7.x86_64smp.ko
Creating /dev/pax device with major number 246 ... done.
Setting group ownership of devices to group "vtune" ... done.
Setting file permissions on devices to "666" ... done.
The pax driver has been successfully loaded.
PAX service has been started.
Checking for socperf driver ... not detected.
Executing: insmod ./socperf/src/socperf2_0-x32_64-3.10.0-693.21.1.el7.x86_64smp.ko
Creating /dev/socperf2_0 base devices with major number 245 ... done.
Setting group ownership of devices to group "vtune" ... done.
Setting file permissions on devices to "666" ... done.
The socperf2_0 driver has been successfully loaded.
Executing: insmod ./sep4_1-x32_64-3.10.0-693.21.1.el7.x86_64smp.ko
insmod: ERROR: could not insert module ./sep4_1-x32_64-3.10.0-693.21.1.el7.x86_64smp.ko: Invalid parameters

Error:  sep4_1 driver failed to load!

You may need to build sep4_1 driver for your kernel.
Please see the sep4_1 driver README for instructions.

 

Running dmesg I can see that some symbols are coming up as mismatched versions:

dmesg | tail
[1767003.842646] sep4_1: disagrees about version of symbol trace_event_raw_init
[1767003.842648] sep4_1: Unknown symbol trace_event_raw_init (err -22)
[1767003.842686] sep4_1: disagrees about version of symbol ftrace_event_reg
[1767003.842687] sep4_1: Unknown symbol ftrace_event_reg (err -22)
[1767003.842706] sep4_1: disagrees about version of symbol trace_define_field
[1767003.842708] sep4_1: Unknown symbol trace_define_field (err -22)
[1767003.842709] sep4_1: disagrees about version of symbol trace_event_buffer_lock_reserve
[1767003.842711] sep4_1: Unknown symbol trace_event_buffer_lock_reserve (err -22)
[1767003.842724] sep4_1: disagrees about version of symbol filter_current_check_discard
[1767003.842725] sep4_1: Unknown symbol filter_current_check_discard (err -22)

 

Are there additional steps I need to take to get remote capture working? I went ahead and tried to run a remote capture anyway and got this error in the GUI (running on a Windows machine, if that matters). I am attempting a "Profile System" capture.

Collection failed

Jun 06 2018 21:12:31 Collection failed. The data cannot be displayed.

To profile kernel modules during the session, make sure they are available in the /lib/modules/kernel_version/ location.

The following events cannot be collected: CPU_CLK_UNHALTED.THREAD_P_ANY. Consider removing the events from the collection, loading the VTune Amplifier sampling driver using the root credentials, or updating the OS kernel.

 

Thanks!

Function execution time is wrong !?


Hi. I'm very new to VTune.
I'm trying to examine a simple program with two functions, executed one after the other, each containing the same "for" loop (which does some calculations to take up time).

void single()
{
 double tmp;
 for( int i = 1; i<N; ++i )
 {
  tmp = (i*12+i/5+i/10)/i+i*2;
  for(int j = 0; j<10; ++j)
   tmp = (i*12+i/5+i/10)/i+i*2+pow(j,1.35);
 }
 return;
}

void multi()
{
 omp_set_num_threads(4);
 double tmp;
 #pragma omp parallel for private(tmp)
 for( int i = 1; i<N; ++i )
 {
  tmp = (i*12+i/5+i/10)/i+i*2;
  for(int j = 0; j<10; ++j)
   tmp = (i*12+i/5+i/10)/i+i*2+pow(j,1.35);
 }
 return;
}

According to the VTune Basic Hotspots analysis,
the function with the OpenMP directives (multi) took 2.5 s
and the other one (single) took 1.96 s.

This is strange, because my simple std::clock_t begin, end; measurement shows that the parallel one executed 2 times faster,
and even the VTune core usage graphs show the same proportion (the period when 4 threads are in use is the period when multi executes). But the reported times show something else!

Is it possible to see the real times (or the real proportion between these two functions)?
 

 

export filtered timeline graphs


Hi, is it possible to export a selected range of the timeline graphs, like this one, as a CSV file?

SGX call stack collection / analysis


Hi,

I'm analyzing the performance of an SGX application. I used hotspots to improve the performance of functions in my code and got to a point where the top hotspots are "system" calls (e.g. _intel_avx_rep_memcpy, malloc, memchr, _intel_avx_rep_memset, free, etc.).

I believe that with stack information VTune could build a meaningful Top-down tree, which would let me see whether there are blocks in my code that call these "system" functions intensively.

Is there a way to enable call stack collection for SGX analysis, or is there another method that would give me a meaningful top-down tree?

IO subsystem profiling sampling rate


Hello,

I am trying to measure utilization of the IO subsystem for a long-running application with a large number of threads on KNL, but data collection stops early because of the 500 MB limit. Is there a way to lower the sampling rate (that is, increase the sampling interval) so that the application is profiled for its full duration?

Below is the command I am using; it looks like there is no knob for -collect io to change the sampling rate.

amplxe-cl -collect io -analyze-system -- application

Thank you,

Ram
