Channel: Intel® VTune™ Profiler (Intel® VTune™ Amplifier)
Viewing all 1347 articles

Error: The following events cannot be collected

Hi,

I'm running VTune Amplifier 2019.5.0.601413 from the command line and get an error when trying to collect uarch-exploration or memory-access. Here's the full output:

amplxe: Warning: To profile kernel modules during the session, make sure they are available in the /lib/modules/kernel_version/ location.
amplxe: Error: The following events cannot be collected: MEM_LOAD_UOPS_LLC_MISS_RETIRED.LOCAL_DRAM,MEM_LOAD_UOPS_LLC_MISS_RETIRED.REMOTE_DRAM,MEM_LOAD_UOPS_LLC_MISS_RETIRED.REMOTE_FWD,MEM_LOAD_UOPS_LLC_MISS_RETIRED.REMOTE_HITM,MEM_LOAD_UOPS_RETIRED.L1_MISS. Consider removing the events from the collection, loading the VTune Amplifier sampling driver using the root credentials, or updating the OS kernel.
amplxe: Collection failed.

I'm running natively on Ubuntu 14.04 on an Ivy Bridge CPU. The hardware events listed in the error can, however, be collected using the PCM library: I was able to modify the control registers and read out the contents of the counters.

Could you please help? Thanks! 

Mark


VTune 2019 Update 2 - 'amplxe-cl -collect hpc-performance' returns 0.0% Effective CPU Utilization

Hello,

My VTune installation shows suspicious behavior on Skylake processors.

  • SEP driver information:

Compilation command line :

# ./build-driver -pu -ni

Version :

Sampling Enabling Product version: 5.6  built on Jan 26 2019 19:14:20
SEP User Mode Version: 5.6
SEP Driver Version: 5.6
PAX Driver Version: 1.0
Platform type: 111
CPU name: Intel(R) Xeon(R) Processor code named Skylake
PMU: skylake_server
Driver configs: Non-Maskable Interrupt
Copyright(C) 2007-2018 Intel Corporation. All rights reserved.

 

  • amplxe-cl  output

# amplxe-cl -collect hpc-performance dd if=/dev/zero of=/dev/null count=1000000

[...]

Effective CPU Utilization: 0.0%
 | The metric value is low, which may signal a poor logical CPU cores
 | utilization caused by load imbalance, threading runtime overhead, contended
 | synchronization, or thread/process underutilization. Explore sub-metrics to
 | estimate the efficiency of MPI and OpenMP parallelism or run the Locks and
 | Waits analysis to identify parallel bottlenecks for other parallel runtimes.
 |
    Average Effective CPU Utilization: 0.000 out of 48

 

  • amplxe-self-checker output

# amplxe-self-checker.sh

[...]

HW event-based analysis check (Intel driver)
Example of analysis types: Hotspots with knob sampling-mode=hw, HPC Performance Characterization, etc.
    Collection: Ok
    Finalization: Ok
    Report: Fail
amplxe: Error: 0x40000024 (No data) -- No data is collected. Possible reasons:

HW event-based analysis check (Intel driver)
Example of analysis types: Microarchitecture Exploration
    Collection: Ok
    Finalization: Ok
    Report: Fail

HW event-based analysis with uncore events (Intel driver)
Example of analysis types: Memory Access
    Collection: Ok
    Finalization: Ok
    Report: Fail

 

Thank you

Florian
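
The three checks quoted above all pass Collection and Finalization and fail only at Report. For longer self-checker logs, that pattern can be pulled out mechanically. The following is a minimal sketch, assuming the output keeps the layout shown in this post (a title line, an "Example of analysis types:" line, then stage lines); `parse_self_check` and `failed_stages` are hypothetical helper names, not part of VTune.

```python
import re

# Matches stage lines such as "Collection: Ok" or "Report: Fail".
STAGE_RE = re.compile(r"^(Collection|Finalization|Report):\s*(\w+)$")


def parse_self_check(text):
    """Split self-checker output into checks with their stage statuses.

    Assumption: the title of a check is the last non-empty line before
    its "Example of analysis types:" line.
    """
    checks = []
    prev = ""
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("Example of analysis types:"):
            checks.append({
                "title": prev,
                "examples": line.split(":", 1)[1].strip(),
                "stages": {},
            })
        else:
            match = STAGE_RE.match(line)
            if match and checks:
                checks[-1]["stages"][match.group(1)] = match.group(2)
        if line:
            prev = line
    return checks


def failed_stages(checks):
    """Return (examples, stage) pairs for every stage that is not 'Ok'."""
    return [
        (check["examples"], stage)
        for check in checks
        for stage, status in check["stages"].items()
        if status != "Ok"
    ]
```

Feeding the saved output of amplxe-self-checker.sh to `parse_self_check` and then calling `failed_stages` would list which analysis types fail and at which stage. It does not explain why they fail.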

Cannot cancel file dialog when opening a "RECENT RESULTS" file.

Running Windows 10 Pro version 1903. Running Microsoft Visual Studio 2019 version 16.2.4. Intel Parallel Studio XE 2019 Update 4 was added as an extension. I launched VTune Welcome Page from the menu bar icon. The window that usually contains "Configure Analysis..." was blank. I decided to open a file in the RECENT RESULTS list. An Open file dialog is launched asking me to select a VTune results file, which indicates the results file cannot be found (it does exist). I press the Cancel button without navigating away from the default folder the dialog uses. The dialog closes but then it relaunches itself. Cancel again, dialog closes and launches again (etc.).

I then navigated to the VTune results file and opened it. Closed the file. Launched VTune Welcome page again from the menu bar icon. This time the "Configure Analysis..." shows up in the window. When I click on the RECENT RESULTS list on any file, that file is found and opened (no file dialog this time).

I can now run the tool, but just posting the scenario in case other folks run into the same problem.

VPP platform overhead...

During a recent webinar, I believe I heard it claimed that running the collector adds less than 5% overhead. Just how is that computed? There's whatever impact comes from running a workload after "vpp start", and that does seem relatively minimal (I say "seems" because I'm using a desktop, not a dedicated server), but:

  1. Just what resources are being used (a large memory array? temp files?)
  2. Just what is the impact of vpp stop? It seems to spawn a very CPU-intensive Python task which runs for several minutes, and that seems likely to have a noticeable impact.
  3. Similarly for vpp upload: is that "just" an FTP (or scp?) transfer?

I would like to be able to deploy vpp to production servers, but I expect significant pushback if there's periodic impact on running services (obviously, we'll have to do our own measurements). Are there ways to mitigate the impact? Is there an internal vpp stop step which would just store the file and allow the processing (whatever is going on) to be done off the SUT? Can it be configured to do its work with lower impact, perhaps taking longer to execute? If and when we're in production, it won't matter how long it takes for the results to become available.

Is there some way to apply vpp to itself, that is, to use another instance to monitor what's going on during the vpp stop? That might provide compelling evidence of minimal impact, if it is in fact minimal during this phase of processing.

On an unrelated note, I'm more than a little disappointed that I didn't get any responses to my Ubuntu 19 installation question...

vpp collector: Throughput Metrics...

My reports say "Missing bounds information....collector is missing information about Max Memory Bandwidth) to collect this data reinstall and rerun the data collector". As this is a fresh installation (CentOS 7.6), how does this omission come about?

vpp collector: zero length collection file?

vpp: aliased to /opt/intel/vtune_amplifier/vpp/collector/vpp-collect

dhcp-10-1-208-197:/home/khb>vpp start
Gathering Platform Profiler collection data. Run vpp-collect stop to finish the collection.
dhcp-10-1-208-197:/home/khb>vpp stop
Collection stopped.
Collection results saved in /home/khb/dhcp-10-1-208-197_20190911-1236.tar.gz.
dhcp-10-1-208-197:/home/khb>ls -lt *.gz
-rw-r--r--. 1 khb khb 0 Sep 11 13:11 dhcp-10-1-208-197_20190911-1236.tar.gz

Why would this happen? I did drive the SUT to a load of 60, and swap space was minimal.
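
The listing above shows the collector reporting success while the archive is 0 bytes long. A check along the following lines could catch that before the file is uploaded anywhere. This is a minimal sketch, assuming only that the collector is supposed to produce a gzip-compressed tar file; `check_archive` is a hypothetical helper and not part of vpp-collect.

```python
import os
import tarfile


def check_archive(path):
    """Return (ok, reason) for a collection archive such as the .tar.gz above.

    Only size and tar readability are checked; nothing here is specific
    to Platform Profiler.
    """
    if not os.path.isfile(path):
        return False, "missing"
    if os.path.getsize(path) == 0:
        return False, "empty"
    try:
        with tarfile.open(path, "r:gz") as tar:
            members = tar.getnames()
    except (tarfile.TarError, OSError, EOFError):
        return False, "unreadable"
    if not members:
        return False, "no members"
    return True, "%d members" % len(members)
```

For the file in the listing, `check_archive("/home/khb/dhcp-10-1-208-197_20190911-1236.tar.gz")` would return `(False, "empty")`. That confirms the symptom but says nothing about the cause.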

'No data to show' after profiling

Hi.

I'm using VTune Amplifier on RHEL 7.2, installed in a VirtualBox VM, and trying to analyse a C++ application.
VTune attaches to the process (by PID) and starts data collection, but when I run my test scenario and press 'Stop', it shows a 'No data to show. The collected data is not sufficient.' message on the Summary page, and the other pages show no information about the functions called or their execution time.

Can you please tell me why this could happen and how I can fix it? Thanks.

Rebuild and Install the Kernel for GPU Analysis in openSUSE

I am trying to profile an MPI+CUDA application using Intel VTune Amplifier on openSUSE Leap 15.0. However, when I pass "cpugpu-concurrency" to the -collect option, I get the error below:

amplxe: Warning: Collection of GPU usage events cannot be enabled. i915 ftrace events are not available.
amplxe: Warning: Collection of GPU usage events cannot be enabled. i915 ftrace events are not available.
amplxe: Error: Ftrace is already in use. Make sure to stop previous collection first.
amplxe: Collection failed.
amplxe: Fatal error: Unknown critical error
amplxe: Internal Error
 

I checked the CONFIG_DRM_I915_LOW_LEVEL_TRACEPOINTS option in the config file under the boot directory and found that the following options are not set:

#
# drm/i915 Debugging
#
# CONFIG_DRM_I915_WERROR is not set
# CONFIG_DRM_I915_DEBUG is not set
# CONFIG_DRM_I915_SW_FENCE_DEBUG_OBJECTS is not set
# CONFIG_DRM_I915_SW_FENCE_CHECK_DAG is not set
# CONFIG_DRM_I915_SELFTEST is not set
# CONFIG_DRM_I915_LOW_LEVEL_TRACEPOINTS is not set
# CONFIG_DRM_I915_DEBUG_VBLANK_EVADE is not set

Can anyone kindly provide the steps to modify the config file and rebuild the kernel for openSUSE Leap 15.0?

Thanks in advance.
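
Checking whether an option such as CONFIG_DRM_I915_LOW_LEVEL_TRACEPOINTS is set can be scripted, which helps when confirming that a rebuilt kernel picked up the change. The following is a minimal sketch that works on the text of a kernel config file in the format quoted above; `kernel_option_state` is a hypothetical helper name, and where the config file lives on a given system is left to the caller.

```python
import re


def kernel_option_state(config_text, option):
    """Report how a kernel config option appears in the given config text.

    Returns the value after '=' (for example 'y' or 'm') when the option
    is set, 'not set' when the '# ... is not set' comment is present, and
    None when the option is not mentioned at all.
    """
    set_re = re.compile(r"^%s=(.*)$" % re.escape(option))
    unset_re = re.compile(r"^# %s is not set$" % re.escape(option))
    for raw in config_text.splitlines():
        line = raw.strip()
        match = set_re.match(line)
        if match:
            return match.group(1)
        if unset_re.match(line):
            return "not set"
    return None
```

With the excerpt from this post as `config_text`, `kernel_option_state(config_text, "CONFIG_DRM_I915_LOW_LEVEL_TRACEPOINTS")` returns `"not set"`.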

 


Intel VTune Amplifier 2019 (Update 6) cannot recognize the processor...

Hello guys,

Once I switch to HPC, Microarchitecture Exploration or Memory Access, I'm not able to run the analysis (the play button is greyed out). Red text appears after the description of the selected analysis (Screenshot_2.png).
The other problem is that when I run Hotspots profiling, VTune Amplifier crashes. I managed to capture part of the log before the crash (Screenshot_3.png). I would really appreciate your help.

OS: Windows Pro
Processor: Intel i7 7700 (3.6 GHz)

All the best,
Corbo Azer

Remote Linux port and personal shared libraries not showing in results

Hello, I started using Intel VTune yesterday to profile the program I've been working on.

I installed it on my Windows 10 host machine and have been connecting remotely to the Linux Docker container where the program is running.

- The first issue is that I want to use port 2222 to connect to the Linux Docker container, as I do when connecting to it manually via SSH. For that, we use ''-p 127.0.0.1:2222:22 ^'' in the ''docker run'' command.

I specified ''root@127.0.0.1:2222'' as the SSH destination, but the only way I could get it to work was to specify ''-p 127.0.0.1:22:22 ^'' in the ''docker run'' command instead.

- The second issue is that after running a profiling session of a few seconds, I expect to see results that take into account my personal shared libraries, which contain code that I know is executed.

However, what I see is that over tens of seconds of elapsed time, the CPU time is only 0.010s or a similarly low value.

I added the following to the BuildSettings.cmake file:

set(CMAKE_BUILD_TYPE Debug)
set(CMAKE_XCODE_ATTRIBUTE_DEBUG_INFORMATION_FORMAT "dwarf")

and when compiling, the -g flag appears, but VTune is still not seeing any of my personal libraries.

Thank you !

Question on loads, stores and LLC miss count in Memory Access

Hi, all.

I have been trying to profile the memory access of an application using VTune, and I have the following questions about the load, store and LLC miss counts.

1. Do the load and store counts represent only loads/stores that occurred in the LLC, or do they count every single load and store in L1, L2 and LLC?

2. I thought the LLC miss count should be the same as the DRAM access count, but in my results the DRAM access count is larger than the LLC miss count. What could cause this?

 

I attached the images for each question. 

Thanks for answering in advance.

Attachments:
ls.JPG (image/jpeg, 16.81 KB)
LLC.JPG (image/jpeg, 11.8 KB)

Can I see results in something like a timeline view in VTune?

Hi, all.

 

The attached image is from nvprof.

I am wondering if I can see profiling results in VTune like those in the attached image, so that I can see which function is running at a given time in a timeline view.

 

Thanks for answering in advance.

Attachments:
timeline-view.png (image/png, 54.1 KB)

Problem running VTune Amplifier

I have a Fortran program that uses the LAPACK library in Intel MKL. The program compiles and runs very well in MS Visual Studio. When I tried to analyze the program via "Tools -> Intel VTune Amplifier -> Profile with VTune Amplifier" and started to run the program, it showed the following error message:

"The procedure entry point mkl_serv_inspector_suppress could not be located in the dynamic link library  ****" 

See attached image for error.

Can you help to fix the issue?

Many Thanks!

 

Attachments:
Capture.PNG (image/png, 8.9 KB)

Failed to connect to VTune Amplifier data provider

I get the following error message when trying to load old results:

However, after deleting the old project and creating a new one, everything works as it should. I just cannot close VTune Amplifier and then reopen it to continue working on an old project.

127.0.0.1 is indeed excluded from my local proxy settings, and I have no local firewall running on this machine. I'm not sure about any additional firewalls on my company's side, though.

The operating system is Linux, openSUSE 42.3.

How can I install VTune on vLab with sampling driver?

Hi, all.

I have been trying to install VTune on Intel vLab.

However, the installer said that root permission is needed to install the sampling driver, and that the sampling driver will not be installed since this is a virtual system.

 

Is there any possible way to install VTune with sampling driver on vLab?


Can I collect data in multiple modes?

I'm interested in the vectorization and memory access performance of my application, and I'm quite new to VTune.

 

I use two commands,

amplxe-cl -collect hpc-performance

amplxe-cl -collect memory-access

 

Each result (the r@@@hpc and r@@@macc directories that are created) has detailed data that the other doesn't have, but I want both. So I'm wondering if I can get both sets of results with one command.
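
Whether one command can combine the two analyses is not settled in this post. Short of that, the two collections can at least be launched back to back from a single wrapper. The following is a minimal sketch under that assumption: the helper names and the result directory naming are hypothetical, and the use of `-result-dir` and `--` to separate the target should be checked against the command-line help of the installed version.

```python
import subprocess


def build_collect_commands(app_argv, analyses=("hpc-performance", "memory-access")):
    """Build one amplxe-cl command line per analysis type for the same target."""
    commands = []
    for analysis in analyses:
        # Hypothetical naming scheme for the result directories.
        result_dir = "r_" + analysis.replace("-", "_")
        commands.append(
            ["amplxe-cl", "-collect", analysis, "-result-dir", result_dir, "--"]
            + list(app_argv)
        )
    return commands


def run_all(app_argv):
    """Run the collections one after another, stopping at the first failure."""
    for cmd in build_collect_commands(app_argv):
        subprocess.run(cmd, check=True)
```

Calling `run_all(["./my_app"])` (with `./my_app` standing in for the real target) would still produce two separate result directories from two separate runs of the application, so run-to-run variation between them remains.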

Question about analyzing threading efficiency

Hi, I've recently been profiling PyTorch code that uses AVX instructions on my 16-core CPU. The results look odd to me, and there is one metric whose meaning I don't know. Here is a Google Drive link to my results:

https://drive.google.com/open?id=1dwy_DA6e6M9f9ruvOaR7-yFzOVvBD7jP

If 'H/W Context' means a physical core, is the 'VTune Result 6' image file saying that only 9 cores are working after 16000s? That seems quite odd, and I don't know why it happens. This phenomenon appears when I increase the number of training epochs in my Python code (that is, the number of loop iterations). The image files numbered 1~5 are for code with 5, 15, 25, 50 and 50 epochs; I ran the 50-epoch code twice.

Where to download vtune_amplifier_target_sep_x86.tgz?

Hi, I'm trying to install the vtune_amplifier_target_sep_x86.tgz package. Can someone please point me to where I can download the tar file? The automatic installation has failed. 

 

Thanks, 

Asha

VTune not able to identify processor type

Hi, when trying to find hotspots, VTune is not able to identify the processor type, although it's a macOS machine with an i5 installed. Can anyone post a solution for the issue?

Thanks,

Asha

Basic questions about VTune

Does VTune work using perf?

Can VTune work on hardware that does not have an x86 Intel CPU inside, e.g. an embedded Samsung CPU?

I ran the self-check tool:

Intel(R) VTune(TM) Amplifier Self Check Utility
Copyright (C) 2009-2019 Intel Corporation. All rights reserved.
Build Number: 602217

Instrumentation based analysis check
Example of analysis types: Hotspots with default knob sampling-mode=sw, Threading with default knob sampling-and-waits=sw
    Collection: Ok
    Finalization: Ok
amplxe: Warning: Cannot locate debugging information for file `/opt/intel/vtune_amplifier_2019.6.0.602217/lib64/libtpsstool.so'.
    Report: Ok

HW event-based analysis check (Perf)
Example of analysis types: Hotspots with knob sampling-mode=hw, HPC Performance Characterization, etc.
    Collection: Ok
amplxe: Warning: Access to /proc/kallsyms file is limited. Consider changing /proc/sys/kernel/kptr_restrict to 0 to enable resolution of OS kernel and kernel modules symbols.
amplxe: Warning: To profile kernel modules during the session, make sure they are available in the /lib/modules/kernel_version/ location.
    Finalization: Ok
    Report: Ok

HW event-based analysis check (Perf)
Example of analysis types: Microarchitecture Exploration
    Collection: Ok
amplxe: Warning: Access to /proc/kallsyms file is limited. Consider changing /proc/sys/kernel/kptr_restrict to 0 to enable resolution of OS kernel and kernel modules symbols.
amplxe: Warning: To profile kernel modules during the session, make sure they are available in the /lib/modules/kernel_version/ location.
    Finalization: Ok
    Report: Ok

HW event-based analysis with uncore events (Perf)
Example of analysis types: Memory Access
    Collection: Ok
amplxe: Warning: Access to /proc/kallsyms file is limited. Consider changing /proc/sys/kernel/kptr_restrict to 0 to enable resolution of OS kernel and kernel modules symbols.
amplxe: Warning: To profile kernel modules during the session, make sure they are available in the /lib/modules/kernel_version/ location.
    Finalization: Ok
    Report: Ok

HW event-based analysis with stacks (Perf)
Example of analysis types: Hotspots with knob sampling-mode=hw and knob enable-stack-collection=true, etc.
    Collection: Ok
amplxe: Warning: Access to /proc/kallsyms file is limited. Consider changing /proc/sys/kernel/kptr_restrict to 0 to enable resolution of OS kernel and kernel modules symbols.
amplxe: Warning: To profile kernel modules during the session, make sure they are available in the /lib/modules/kernel_version/ location.
    Finalization: Ok
amplxe: Warning: Cannot locate debugging information for file `/lib/x86_64-linux-gnu/libgcc_s.so.1'.
    Report: Ok

HW event-based analysis with context switches (Perf)
Example of analysis types: Threading with knob sampling-and-waits=hw
    Collection: Ok
amplxe: Warning: For analyses using the Perf-based driverless collection, the preemption and synchronization context switches may not be differentiated on kernels older than 4.17. To identify the context switch types on such kernels, switch to the driver-based collection by setting the Stack size option to the unlimited (0) value.
amplxe: Warning: Access to /proc/kallsyms file is limited. Consider changing /proc/sys/kernel/kptr_restrict to 0 to enable resolution of OS kernel and kernel modules symbols.
amplxe: Warning: To profile kernel modules during the session, make sure they are available in the /lib/modules/kernel_version/ location.
    Finalization: Ok
    Report: Ok

The system is ready to be used for performance analysis with Intel VTune Amplifier.
Review warnings in the output above to find product limitations, if any.

But I had to set the value in the perf_event_paranoid file to 0.

I don't really know what this is needed for; I just found the following in the output of a previous run:

Please set the /proc/sys/kernel/perf_event_paranoid value to 0 or less to continue without installing the drivers.
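
For context, perf_event_paranoid is the Linux kernel setting that restricts how much of the perf events interface unprivileged users may use, with lower values being less restrictive. The message above asks for 0 or less so that the Perf-based collection can run without the drivers. A script can verify the current value before starting a collection. The following is a minimal sketch; `read_int_setting` and `driverless_ready` are hypothetical helper names, and the threshold is taken directly from the quoted message.

```python
def read_int_setting(path):
    """Read a single integer from a procfs-style file; None if unreadable."""
    try:
        with open(path) as f:
            return int(f.read().strip())
    except (OSError, ValueError):
        return None


def driverless_ready(paranoid_value):
    """True when the value meets the '0 or less' condition quoted above."""
    return paranoid_value is not None and paranoid_value <= 0
```

On a Linux system, `driverless_ready(read_int_setting("/proc/sys/kernel/perf_event_paranoid"))` reports whether the condition in the message is met. The same reader works for /proc/sys/kernel/kptr_restrict, which the self-check warnings above suggest setting to 0 for kernel symbol resolution.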

 
