Channel: Intel® VTune™ Profiler (Intel® VTune™ Amplifier)
Viewing all 1347 articles

[Instrumentation Engine]: Attach to pid 190341 failed: Operation not permitted

Hi guys!

I want to use VTune for remote analysis, but it doesn't work. The host is Windows and the target is CentOS.

I have set up passwordless SSH successfully. No matter which target type I select to profile, it reports the following:

 

Collection failed

Feb 05 2019 18:48:42 Collection failed. The data cannot be displayed.
Failed to attach to the specified target process. Please make sure the process exists and VTune Amplifier process has enough permissions to attach to the target process. See the Troubleshooting help topic for more details.
[Instrumentation Engine]: Attach to pid 190341 failed: Operation not permitted 

 

 

Give me some advice.

Thank you
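
On Linux targets, one frequent cause of an attach failing with "Operation not permitted" is the Yama ptrace_scope kernel setting, which can prevent a process from attaching to another process that is not its descendant. Whether that applies to this CentOS target is an assumption; the log does not confirm it. A minimal Python sketch for checking the setting on the target follows. The helper names read_ptrace_scope and describe are hypothetical, and the path is the standard Yama sysctl file, which may be absent on kernels without Yama.

```python
from pathlib import Path

# Standard location of the Yama setting; absent if Yama is not enabled.
PTRACE_SCOPE_PATH = "/proc/sys/kernel/yama/ptrace_scope"

# Meanings as described in the Linux kernel documentation for the Yama LSM.
SCOPE_MEANING = {
    0: "classic ptrace permissions",
    1: "restricted: only descendants can be attached to",
    2: "admin-only: attaching requires CAP_SYS_PTRACE",
    3: "no attach allowed at all",
}


def read_ptrace_scope(path=PTRACE_SCOPE_PATH):
    """Return the ptrace_scope value as an int, or None if unavailable."""
    try:
        return int(Path(path).read_text().strip())
    except (OSError, ValueError):
        return None


def describe(scope):
    """Turn a ptrace_scope value into a short human-readable description."""
    if scope is None:
        return "ptrace_scope not available (Yama may be disabled)"
    return SCOPE_MEANING.get(scope, f"unknown value {scope}")


if __name__ == "__main__":
    print(describe(read_ptrace_scope()))
```

If this reports a restricted mode, that is one candidate explanation for the failed attach. With the classic mode, other causes are more likely, for example the target process being owned by a different user than the one used for the SSH connection.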


Serial number is already registered

I'd like to try Intel VTune to profile an application, and I have little time to do this work.

Here are the details of the issue -

I received an Intel vtune software evaluation email with a serial# to register.

I attempted to register using the serial# to allow a trial download of vtune.

When I enter the serial# and click the Register button an error message "Serial number is already registered." is displayed.

Now what?

Thank you in advance for your time,

Stewart

Finalization error (createEventInstance: wrong uniqueTid)

Hey,

I am attempting to profile a threaded version of a Fortran simulation using VTune. The threading was done using OpenMP and the Intel Fortran compiler. After completing data collection I get the following error message during finalization: 

Cannot load data file 'C:\Users\...\r001hs\data.0\3740-14804.0.trace' (createEventInstance: wrong uniqueTid)

This leads to inaccurate results showing wrong total wall time or wrong thread count/histogram.

I managed to do a couple of analyses before but, since this started happening, I don't seem to be able to redo them without this error. Also, I don't get this error when I profile the single thread version. I don't know where to start with troubleshooting this so any tips will be appreciated. 

Software and system information:

  • I am using Intel VTune Amplifier GUI 2019 Update 2 Product Build 588069. The analysis type is Hotspots and it's running on my local machine. Also, I always run it as admin.
  • Compiler: Intel Fortran compiler 19.0 Update 1 for the Visual Studio 2017 environment. I compile with the following options:

/nologo /O2 /fpp /Qopenmp /module:"Release\\" /object:"Release\\" /Fd"Release\vc150.pdb" /libs:static /threads /c /Zi /debug:inline-debug-info

  • My OS is:  64 bit Windows 10 Enterprise version 10.0.17134 Build 17134.
  • My processor is: Intel Core i7-7800X @ 3.50GHz with 12 Logical Processors

 

Regards,

Ghassan
 

 

Python Thread Race Error

Hi all,

Upon analyzing even a simple Python script with VTune Amplifier, I get the following Python error and no performance analysis:

Fatal Python error: ceval: tstate mix-up

I used the analysis set up as described in this manual: https://software.intel.com/en-us/vtune-amplifier-help-python-code-analysis

The python script looks as follows:

import numpy as np

def my_func():
    x = np.array([1,2,3])
    y = np.array([1,2,3])

    for i in range(500):
        x += y

if __name__ == '__main__':
    my_func()

Am I missing something? Any help is appreciated.

Thank you!

System specifications:

Win 10
VTune Amplifier 2019 Update 2
IntelPython 2

find integer vectorised ops

It is fairly easy to find the percentage of vectorised floating point operations in an uninstrumented code (no debug symbols) including vector length (AVX or not). How to find if a code is vectorising integer operations? Preferably using HW counters.

More generally, what to look for in a code which is computationally intensive but has relatively little FP ops? Are there suitable templates for initial analysis?

Cannot load data file, getModuleInfo: invalid id(): Remote analysis

I have set up everything for remote analysis according to this manual: <https://software.intel.com/en-us/vtune-amplifier-help-linux-system-setup.... I am able to run applications on the target and collect data. While resolving the data, VTune throws an error:

Cannot load data file `/opt/devel/vtune/projects/test/r020hs/data.0/30401-30415.1.trace' (getModuleInfo: invalid id()!).

I did a quick search on Google and found this: <https://software.intel.com/en-us/vtune-amplifier-help-error-message-cann..., which says that there is not enough storage on my disk. I reran VTune with an alternative temporary directory and got the same error.

Any ideas on how to solve this issue?

~Mark

Intel Vtune Amplifier for profiling Docker process doesn't work

Hi,

I started a Docker container with the following command:

docker run --privileged=true --cap-add=SYS_PTRACE    -it  <docker image name and other flags>

After starting the container, I go inside it and run my application, which is what I want to profile.

To profile it, I get the PID of my process running inside the container with ps -eaf | grep <my application name>

I then use the following command:

$ amplxe-cl -collect hotspots -target-pid=$PID

Once I start amplxe-cl, it kills my running process with a segmentation fault and the following error:

MPLXE_TPSSCOLLECTOR: init:1300: attach_notification_result == tpss_er_success : attach_notification_result = 14
Assertion failed: init:1300: attach_notification_result == tpss_er_success : attach_notification_result = 14.

Is it possible to use VTune with a process running in Docker?

 

 


missing libs when running system_analyzer

system_analyzer seems like an interesting tool to try (it was introduced in 2019 Update 2; I'm using 2019 Update 3).

However, I'm getting errors when trying to run it on CentOS 7:

ldd /opt/intel/vtune_amplifier/system_analyzer/target/gpa_router
/opt/intel/vtune_amplifier/system_analyzer/target/gpa_router: /lib64/libstdc++.so.6: version `CXXABI_1.3.9' not found (required by /opt/intel/vtune_amplifier/system_analyzer/target/gpa_router)
/opt/intel/vtune_amplifier/system_analyzer/target/gpa_router: /lib64/libstdc++.so.6: version `GLIBCXX_3.4.21' not found (required by /opt/intel/vtune_amplifier/system_analyzer/target/gpa_router)
/opt/intel/vtune_amplifier/system_analyzer/target/gpa_router: /lib64/libstdc++.so.6: version `GLIBCXX_3.4.20' not found (required by /opt/intel/vtune_amplifier/system_analyzer/target/gpa_router)
/opt/intel/vtune_amplifier/system_analyzer/target/gpa_router: /lib64/libstdc++.so.6: version `CXXABI_1.3.8' not found (required by /opt/intel/vtune_amplifier/system_analyzer/target/gpa_router)

thanks
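
The ldd output above names the libstdc++ symbol versions that the system library does not provide. As a rough aid for reading such output, here is a Python sketch that collects the missing version tags per library. The function name missing_versions and the shortened sample text are illustrative only, and the regular expression assumes the GNU loader message format shown in the post.

```python
import re
from collections import defaultdict

# Matches lines such as:
#   <binary>: /lib64/libstdc++.so.6: version `GLIBCXX_3.4.21' not found (...)
_PATTERN = re.compile(r":\s*(\S+):\s*version [`'](\S+?)' not found")


def missing_versions(ldd_output):
    """Map each library path to the sorted list of version tags it lacks."""
    found = defaultdict(set)
    for line in ldd_output.splitlines():
        match = _PATTERN.search(line)
        if match:
            library, version = match.groups()
            found[library].add(version)
    return {lib: sorted(tags) for lib, tags in found.items()}


if __name__ == "__main__":
    # Shortened, illustrative sample in the same format as the post's output.
    sample = (
        "gpa_router: /lib64/libstdc++.so.6: version `CXXABI_1.3.9' not found"
        " (required by gpa_router)\n"
        "gpa_router: /lib64/libstdc++.so.6: version `GLIBCXX_3.4.21' not found"
        " (required by gpa_router)\n"
    )
    for lib, tags in missing_versions(sample).items():
        print(lib, "lacks", ", ".join(tags))
```

Applied to the full output in this post, the summary would contain a single library, /lib64/libstdc++.so.6, with four missing tags. That pattern suggests gpa_router was built against a newer libstdc++ than the one installed on this CentOS 7 system.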

Strange loads and stores analysis

I am trying to optimize my code. I use Memory Access analysis and pick the top function when sorted by CPU time.

I opened the function in the source and assembly views. In the per-line analysis, I found something strange.

The line that consumes the most time is a simple punpcklbw instruction, and it shows a large number of loads/stores. I think this is impossible, since the instruction only accesses an xmm register. The code and analysis are uploaded as an image.

The block 70 assembly code corresponds to the "if (left)" branch in the C code.

The "punpcklbw  xmm2, xmm2" instruction does not access memory. So why does this line show so many loads/stores?

Can anyone help me interpret this result? And how can I reduce the time consumed by this code block?

This block is the biggest time consumer in the most time-consuming function.

Attachment: 2181_6141.25.png (image/png, 349.77 KB)

omp taskloop fortran 90

Dear Fellows,

 

I am working on a code parallelization task that involves nested loops.

I improved the performance by parallelizing an inner loop.

The challenge now is that threads are created and destroyed on every iteration of the outer loop, which adds overhead.

I searched for a way to create the thread pool only once and came across the taskloop construct in OpenMP with Fortran 90.

The issue now is that I applied the code as described in the OpenMP documentation, but execution never enters the taskloop:


!$omp taskloop
 ....never go here
!$omp end taskloop

I also added a single construct around it:

!$omp single

!$omp taskloop

!$omp end taskloop

!$omp end single

but that did not help either; the code between the taskloop start and end statements is still never executed.

Any suggestions on what to do next would be of great help.

Sincerely

Unstable test results on remote Linux systems

I am using Intel VTune Amplifier 2019 Update 3 on a Linux system and am trying to analyze CPU/FPGA interaction on remote Linux systems.

On the host system, I select "Remote Linux (SSH)" in the Where pane and use "Attach to Process" in the What pane to specify the process ID (PID) to analyze. In the How pane, I select "CPU/FPGA Interaction".

However, when I click "Start" to begin the analysis, the target system often crashes during processing, or I can't see any FPGA-related content in the results. I am sure that the FPGA is executing, so why is the FPGA not detected, and why are the results unstable?

Has anyone tried or encountered a similar problem?

Thank you for considering my request.

VTune cannot get GPU Compute/Media Hotspots info with below error messages.

Hi,

I want to use Intel VTune to collect Intel Graphics profiling data, but it fails with the messages below:

- Valid setenv symbol is not found in the binary of the analysis target.

- Binary file of the analysis target does not contain symbols required for profiling. See the 'Analyzing Statically Linked Binaries' help topic for more details.

- Assertion failed: object_impl:35 (obj != ((void *)0)): Please contact the technical support.

 

I have tried several methods, but none of them works.

This is an urgent request; please reply soon.

 

Thanks

Danyu

Where in report GUI can I find cpu frequency information?

"Configure Analysis" has the option "Collect CPU frequency data" at the bottom.

But where can I find the frequency data in the analyzed report?

OpenCL GPU In-kernel profiling showing all time on kernel declaration

I'm trying to use VTune's GPU In-Kernel Profiling feature to optimize an OpenCL kernel running on an Intel GPU (the HD Graphics 620 built into an i5-7300U).

I initially installed VTune Amplifier 2019 Update 3, but then realized that apparently OpenCL GPU in-kernel profiling has been temporarily removed as of that update "to address some defects" (according to a note in the Intel docs here: https://software.intel.com/en-us/vtune-amplifier-help-gpu-in-kernel-prof...).

Since it was removed from that version, I uninstalled it and installed 2019 Update 2 instead. I can now get the in-kernel profiling feature to launch and display the source code properly, but all of the time for the kernel is being displayed on the line that contains only the kernel declaration. Obviously, that's not particularly helpful for optimizing the kernel.

Here's a screenshot of what I'm seeing with the first few lines of the kernel, but none of the other lines of the kernel show any time, either. Only the kernel declaration line shows any time (and it's the total execution time for the kernel.)

Is this a known bug in VTune Amplifier 2019 Update 2 or am I just doing something wrong? Is there something in particular I need to enable in order to get the profiler to have line-by-line resolution on the execution times rather than only showing total execution time for the entire kernel?


amplxe: Error: Ftrace is already in use.

Hi,
I am trying to run VTune Amplifier 2019 Update 2 to collect system-overview as follows:

export NPROCS=36 
export OMP_NUM_THREADS=1 
mpirun -genv OMP_NUM_THREADS $OMP_NUM_THREADS -np $NPROCS  amplxe-cl -collect system-overview  -result-dir /home/puneet/run_node02_impi2019_profiler_systemoverview/profiles/attempt1_p${NPROCS}_t${OMP_NUM_THREADS}  -quiet $INSTALL_ROOT/main/wrf.exe

I had collected hpc-performance data without any issue. Afterwards, I ran the aforementioned command but had to kill it (the result dir was incorrect). When I re-ran amplxe-cl, I got the following error messages:

amplxe: Error: Ftrace is already in use. Make sure to stop previous collection first. 
amplxe: Error: Ftrace is already in use. Make sure to stop previous collection first. 
amplxe: Error: Ftrace is already in use. Make sure to stop previous collection first.

I have tried deleting /home/puneet/run_node02_impi2019_profiler_systemoverview/profiles/* and I have also rebooted the node. Even then, those error messages still show up.

Then, on the same node, I ran general-exploration, and although there are some warning messages on stdout, the collection seems to work fine:

amplxe: Warning: The analysis type 'general-exploration' is deprecated. Use 'uarch-exploration' analysis type instead. See more details with 'amplxe-cl -help collect uarch-exploration'.
amplxe: Warning: To profile kernel modules during the session, make sure they are available in the /lib/modules/kernel_version/ location.
 starting wrf task            2  of           40
....

This seems to be an issue only with the "system-overview" analysis.
Please advise.
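
The message indicates that something still holds the kernel's ftrace facility. One way to inspect the current ftrace state is through its control files. The Python sketch below reads current_tracer, set_event and tracing_on from the tracing directory. The default path /sys/kernel/debug/tracing is the conventional debugfs location and normally requires root to read. The helper names ftrace_state and looks_busy are hypothetical, and whether VTune bases its "already in use" check on these particular files is an assumption.

```python
from pathlib import Path

# Conventional debugfs location of the ftrace control files (needs root).
DEFAULT_TRACING_DIR = "/sys/kernel/debug/tracing"


def _read(path):
    """Return stripped file contents, or None if the file cannot be read."""
    try:
        return path.read_text().strip()
    except OSError:
        return None


def ftrace_state(tracing_dir=DEFAULT_TRACING_DIR):
    """Collect the active tracer, the enabled events and the on/off switch."""
    base = Path(tracing_dir)
    return {
        "current_tracer": _read(base / "current_tracer"),
        "set_event": _read(base / "set_event"),
        "tracing_on": _read(base / "tracing_on"),
    }


def looks_busy(state):
    """Heuristic only: a tracer other than 'nop' is set, or events are enabled."""
    tracer = state["current_tracer"]
    return (tracer not in (None, "nop")) or bool(state["set_event"])


if __name__ == "__main__":
    state = ftrace_state()
    print(state)
    print("busy" if looks_busy(state) else "idle or unreadable")
```

If events remain enabled after the killed run, that would be consistent with the error. Since the node has already been rebooted, it may also be worth checking whether another tool or service enables tracing at boot.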

GPU Concurrency

I'm trying to analyze the OpenVINO object_detection_demo_ssd_async sample using the VTune Amplifier CPU/GPU Concurrency analysis on an i7-7700 platform.

The following warning messages were shown:

Collection of GPU usage is not possible due to a lack of credentials. Make sure you have read/write access to debugFS. You may either run the analysis with root privileges (recommended) or following configuration instructions provided in the software event library help topic

Collection of context switches is not possible due to a lack of credentials.Make sure you have read/write access to debugFS. You may either run the analysis with root privileges (recommended) or following configuration instructions provided in the software event library help topic

So I copied the script and executed it with root privileges:

amplxe: Warning: GPU usage collection requires driver trace-points to be enabled in the driver kernel module. Make sure the kernel configuration option is set as CONFIG_DRM_i915_LOW_LEVEL_TRACEPOINTS=y.

Then I tried to rebuild and install the kernel, following the steps in the Intel® VTune™ Amplifier 2019 User Guide.

But the build failed:

# make -C tools/ objtool
make: Entering directory '/usr/src/linux-headers-4.15.0-46-generic/tools'
  DESCEND  objtool
make[1]: Entering directory '/usr/src/linux-headers-4.15.0-46-generic/tools/objtool'
make -C /usr/src/linux-headers-4.15.0-46-generic/tools/build CFLAGS= LDFLAGS= fixdep
make[2]: Entering directory '/usr/src/linux-headers-4.15.0-46/tools/build'
/usr/src/linux-headers-4.15.0-46-generic/tools/build/Makefile.build:37: /usr/src/linux-headers-4.15.0-46-generic/tools/build/Build.include: No such file or directory
make[3]: *** No rule to make target '/usr/src/linux-headers-4.15.0-46-generic/tools/build/Build.include'.  Stop.
Makefile:43: recipe for target 'fixdep-in.o' failed
make[2]: *** [fixdep-in.o] Error 2
make[2]: Leaving directory '/usr/src/linux-headers-4.15.0-46/tools/build'
/usr/src/linux-headers-4.15.0-46-generic/tools/build/Makefile.include:4: recipe for target 'fixdep' failed
make[1]: *** [fixdep] Error 2
make[1]: Leaving directory '/usr/src/linux-headers-4.15.0-46-generic/tools/objtool'
Makefile:63: recipe for target 'objtool' failed
make: *** [objtool] Error 2
make: Leaving directory '/usr/src/linux-headers-4.15.0-46-generic/tools'

So:

  1. How do I run the VTune GUI with root privileges?
  2. How do I build the kernel? Why did I get that error?
  3. Is this the right way to solve the problem?
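
Before rebuilding anything, it can help to confirm how the running kernel was configured. The warning above names the option as CONFIG_DRM_i915_LOW_LEVEL_TRACEPOINTS; option names in kernel config files are conventionally upper case, so the sketch below compares names case-insensitively. It parses the text of a kernel config file and reports the value of one option. The location /boot/config-<release> is common on Debian and Ubuntu systems, but that it exists on this machine is an assumption, and the function name kernel_option is hypothetical.

```python
import platform
from pathlib import Path

OPTION = "CONFIG_DRM_I915_LOW_LEVEL_TRACEPOINTS"


def kernel_option(config_text, name=OPTION):
    """Return the option's value, 'not set', or None if it is never mentioned."""
    wanted = name.upper()
    for raw in config_text.splitlines():
        line = raw.strip()
        # Disabled options appear as: "# CONFIG_FOO is not set"
        if line.startswith("#"):
            words = line.lstrip("# ").split()
            if words and words[0].upper() == wanted and words[1:] == ["is", "not", "set"]:
                return "not set"
            continue
        key, sep, value = line.partition("=")
        if sep and key.upper() == wanted:
            return value
    return None


if __name__ == "__main__":
    # Assumed location of the running kernel's config file.
    path = Path("/boot") / f"config-{platform.release()}"
    try:
        print(path, "->", kernel_option(path.read_text()))
    except OSError as err:
        print("could not read kernel config:", err)
```

A result of "not set" or None would mean the running kernel was built without the option, which matches the warning. A result of "y" would point to a different cause.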

Licensing options for VTune

Hi,

It seems to me that Intel® VTune Amplifier has two licensing models: one paid (1-seat floating license $3149 or 1-seat named-user license $899) and one free (90-days community license, refreshable an unlimited number of times, inside  Intel System Studio ). The principal difference between them is the type of support provided (Intel Priority Support on the paid license, internet community support on the free license), as far as I can see.

It seems weird for Intel to offer the same tool both for free (in a software bundle) and for lots of money (as a standalone package), so I have a feeling I'm missing something here. I'd like to use the tool in the development of our commercial product, but I wouldn't like to ask my boss to fork over $$$ for that. Also, it appears that the community license should not be confused with the academic license, although both are free.

Does anybody know what are the limits for the community license as applicable to a regular company? Am I allowed to use this tool for free?

Thanks!

amplxe-gui - problem with libdbus-1.so.3

Hi, I would like to use amplxe-gui. However whenever I execute it I receive this strange information:
 

$ amplxe-gui
amplxe-gui: /lib/x86_64-linux-gnu/libdbus-1.so.3: no version information available (required by amplxe-gui)
/data00/intel/vtune_amplifier_2019.3.0.590814/bin64/amplxe-gui: /lib/x86_64-linux-gnu/libdbus-1.so.3: no version information available (required by /data00/intel/vtune_amplifier_2019.3.0.590814/bin64/amplxe-gui)

 

I have already installed the following packages:

$ sudo apt install libdbus-glib-1-dev dbus libdbus-1-dev
Reading package lists... Done
Building dependency tree       
Reading state information... Done
dbus is already the newest version.
libdbus-1-dev is already the newest version.
libdbus-glib-1-dev is already the newest version.
The following packages were automatically installed and are no longer required:
  libnfsidmap2 libtirpc1 libuuid-perl
Use 'apt-get autoremove' to remove them.
0 upgraded, 0 newly installed, 0 to remove and 150 not upgraded.

 

My OS version is:

$ uname -a
Linux n15-038-220 4.9.0-0.bpo.5-amd64 #1 SMP Debian 4.9.65-3+deb9u2~bpo8+12 (2018-02-08) x86_64 GNU/Linux

 

Please help!

vtune community edition not working for skylake

Hi,

I just downloaded and installed the free version of VTune on my Skylake-based server. I want to profile the CPU utilization of my workloads. Below is the error log I get when verifying the installation, which suggests that there may be an issue with support for Skylake. Can you please confirm what the issue may be?

Here are the cpu details.

CPU family:            6
Model:                 85
Model name:            Intel Xeon Processor (Skylake, IBRS)

Thanks,

Amar

Intel(R) VTune(TM) Amplifier Self Check Utility
Copyright (C) 2009-2017 Intel Corporation. All rights reserved.
Build Number: 590814

Instrumentation based analysis check
Example of analysis types: Hotspots with default knob sampling-mode=sw, Threading
    Collection: Fail
amplxe: Error: Basic Hotspots analysis is not supported on this platform. Please see the software/hardware requirements in the product Release Notes.
amplxe: Warning: Hardware collection of CPU events is not possible on this system. Microarchitecture performance insights will not be available.

HW event-based analysis check
Example of analysis types: Hotspots with knob sampling-mode=hw, HPC Performance Characterization, etc.
    Collection: Fail
amplxe: Error: This analysis type is not applicable to the system because VTune Amplifier cannot recognize the processor. If this is a new Intel processor, please check for an updated version of VTune Amplifier. If this is an unreleased Intel processor, please contact Online Service Center for an NDA product package.
amplxe: Error: This analysis type is not applicable to the current machine microarchitecture.

HW event-based analysis check
Example of analysis types: Microarchitecture Exploration
    Collection: Fail
amplxe: Error: This analysis type is not applicable to the system because VTune Amplifier cannot recognize the processor. If this is a new Intel processor, please check for an updated version of VTune Amplifier. If this is an unreleased Intel processor, please contact Online Service Center for an NDA product package.
amplxe: Error: This analysis type is not applicable to the current machine microarchitecture.

HW event-based analysis with uncore events
Example of analysis types: Memory Access
    Collection: Fail
amplxe: Error: This analysis type is not applicable to the system because VTune Amplifier cannot recognize the processor. If this is a new Intel processor, please check for an updated version of VTune Amplifier. If this is an unreleased Intel processor, please contact Online Service Center for an NDA product package.
amplxe: Error: This analysis type is not applicable to the current machine microarchitecture.

HW event-based analysis with stacks
Example of analysis types: Hotspots with knob sampling-mode=hw and knob enable-stack-collection=true, etc.
    Collection: Fail
amplxe: Error: This analysis type is not applicable to the system because VTune Amplifier cannot recognize the processor. If this is a new Intel processor, please check for an updated version of VTune Amplifier. If this is an unreleased Intel processor, please contact Online Service Center for an NDA product package.
amplxe: Error: This analysis type is not applicable to the current machine microarchitecture.

The check observed a product failure on your system.
Review errors in the output above to fix a problem or contact Intel technical support.
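
The model name reported above, "Intel Xeon Processor (Skylake, IBRS)", is the kind of generic name that virtual CPUs often carry (QEMU/KVM uses names of this form for its CPU models). It may therefore be worth checking whether the server is a virtual machine, since hardware event-based collection is often unavailable to guests. That this system is virtualized is an assumption; the log does not state it. A minimal Python sketch follows: on x86 Linux, the hypervisor flag in /proc/cpuinfo is set when the kernel detects that it runs under a hypervisor. The function name runs_under_hypervisor is hypothetical.

```python
def runs_under_hypervisor(cpuinfo_text):
    """Return True if any 'flags' line of /proc/cpuinfo lists 'hypervisor'."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags" and "hypervisor" in value.split():
            return True
    return False


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as handle:
            text = handle.read()
    except OSError as err:
        print("could not read /proc/cpuinfo:", err)
    else:
        print("hypervisor flag present:", runs_under_hypervisor(text))
```

If the flag is present, the failures in the self-check may stem from the virtual environment rather than from missing Skylake support.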

 
