Channel: Intel® VTune™ Profiler (Intel® VTune™ Amplifier)
Viewing all 1347 articles
Browse latest View live

How to understand Retired Instructions at assembly code level?


Hi

I'm trying to understand VTune's "Retired Instructions" metric. I've read some previous questions and answers in this forum but didn't find the answer to my question. I'll explain it using an example. Here is a small piece of code from a performance tuning session:

(This is C# code.) As can be seen, there are 144 and 999 million instructions retired on rows 67 and 68, respectively. The loop should execute each line the same number of times. So far so good, since each line contains several instructions. However, when I show the assembly code for row 67, for instance, I get lost. It shows the following instructions with the following retired-instruction counts. 27, 57, and 60 add up to 144, but why is there a different count for each instruction?

 

I guess the first, greyed-out mov instruction has some uncertainty since it is greyed out (?). But what about the other two? Why the difference? Is it due to speculation? If so, shouldn't the imul instruction have a higher count than the add instruction, unless execution does not actually happen in this order? Some advice on how to interpret these numbers, please.

 

 


Kernel panic on el6

Hi,

A user tries to run a memory access analysis, and the compute node panics immediately when the job starts.
srun amplxe-cl -collect memory-access -knob analyze-mem-objects=true -knob analyze-openmp=true ./Elmfire-Dev

(It's a hybrid MPI/OpenMP app; the resource manager is Slurm.)

BUG: unable to handle kernel paging request at 000000000000100c
IP: [<ffffffffa0c898a6>] OUTPUT_Reserve_Buffer_Space+0x26/0x190 [sep4_0]
PGD 1017096067 PUD 1017095067 PMD 0
Oops: 0000 [#1] SMP
last sysfs file: /sys/devices/system/cpu/cpu3/cpufreq/cpuinfo_cur_freq
CPU 1
Modules linked in: vtsspp(U) sep4_0(U) socperf2_0(U) pax(U) lmv(U) fld(U) mgc(U) lustre(U) lov(U) osc(U) mdc(U) fid(U) ko2iblnd(U) ptlrpc(U) obdclass(U) lnet(U) lvfs(U) sha512_generic crc32c_intel libcfs(U) cpufreq_ondemand freq_table pcc_cpufreq rdma_ucm(U) ib_ucm(U) rdma_cm(U) iw_cm(U) configfs ib_uverbs(U) ib_umad(U) mlx5_ib(U) mlx5_core(U) mlx4_en(U) ipmi_devintf iTCO_wdt iTCO_vendor_support power_meter acpi_ipmi ipmi_si ipmi_msghandler serio_raw sg sb_edac edac_core i2c_i801 lpc_ich mfd_core hpilo hpwdt ioatdma igb dca i2c_algo_bit i2c_core ptp pps_core ib_ipoib(U) ib_cm(U) mlx4_ib(U) ib_sa(U) ib_mad(U) ib_core(U) ib_addr(U) ib_netlink(U) ipv6 mlx4_core(U) mlx_compat(U) ext4 jbd2 mbcache sd_mod crc_t10dif hpsa(U) scsi_transport_sas wmi dm_mirror dm_region_hash dm_log dm_mod [last unloaded: scsi_wait_scan]

Pid: 4790, comm: amplxe-runss Not tainted 2.6.32-642.15.1.el6.x86_64 #1 HP ProLiant XL230a Gen9/ProLiant XL230a Gen9
RIP: 0010:[<ffffffffa0c898a6>]  [<ffffffffa0c898a6>] OUTPUT_Reserve_Buffer_Space+0x26/0x190 [sep4_0]
RSP: 0018:ffff88101968b838  EFLAGS: 00010282
RAX: 0000000000000000 RBX: ffff88101602e440 RCX: 0000000000000000
RDX: 0000000000000000 RSI: 00000000000000c0 RDI: ffff88101602e440
RBP: ffff88101968b858 R08: 0000000000000000 R09: 00000000000011d6
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
R13: 0000000000000000 R14: 00000000000000c0 R15: ffff88101968b8b8
FS:  00007ff33a385700(0000) GS:ffff880028220000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 000000000000100c CR3: 0000001017093000 CR4: 00000000001407e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process amplxe-runss (pid: 4790, threadinfo ffff881019688000, task ffff88101d5a6040)
Stack:
 ffff88101968b858 ffff88101602e458 0000000000000000 00000000000000c0<d> ffff88101968b898 ffffffffa0c89a62 00000000000002f8 ffff88101968b8b8<d> 0000000000000003 ffff88101968b908 00000000000012b6 ffff88101968bb39
Call Trace:
 [<ffffffffa0c89a62>] OUTPUT_Module_Fill+0x52/0x90 [sep4_0]
 [<ffffffffa0c88544>] linuxos_Load_Image_Notify_Routine+0x174/0x220 [sep4_0]
 [<ffffffffa0c886fe>] linuxos_VMA_For_Process+0x10e/0x1a0 [sep4_0]
 [<ffffffff810097cc>] ? __switch_to+0x1ac/0x340
 [<ffffffffa0c887f4>] linuxos_Enum_Modules_For_Process+0x64/0xc0 [sep4_0]
 [<ffffffffa0c888ba>] linuxos_Exit_Task_Notify+0x6a/0x70 [sep4_0]
 [<ffffffff8154f385>] notifier_call_chain+0x55/0x80
 [<ffffffff810acf2a>] __blocking_notifier_call_chain+0x5a/0x80
 [<ffffffff810acf66>] blocking_notifier_call_chain+0x16/0x20
 [<ffffffff810b0e8a>] profile_task_exit+0x1a/0x20
 [<ffffffff8108175b>] do_exit+0x2b/0x870
 [<ffffffff81081ff8>] do_group_exit+0x58/0xd0
 [<ffffffff81097e06>] get_signal_to_deliver+0x1f6/0x460
 [<ffffffff8100a285>] do_signal+0x75/0x870
 [<ffffffff810abc82>] ? hrtimer_cancel+0x22/0x30
 [<ffffffff8154b2b3>] ? do_nanosleep+0x93/0xc0
 [<ffffffff810abd54>] ? hrtimer_nanosleep+0xc4/0x180
 [<ffffffff810bd99b>] ? sys_futex+0x7b/0x170
 [<ffffffff8100ab10>] do_notify_resume+0x90/0xc0
 [<ffffffff8100b3a1>] int_signal+0x12/0x17
Code: 00 00 00 00 00 55 48 89 e5 48 83 ec 20 48 89 5d e8 4c 89 65 f0 4c 89 6d f8 0f 1f 44 00 00 48 8b 05 a8 16 01 00 48 89 fb 41 89 d5 <44> 8b 80 0c 10 00 00 45 85 c0 0f 85 3a 01 00 00 8b 43 1c 39 f0
RIP  [<ffffffffa0c898a6>] OUTPUT_Reserve_Buffer_Space+0x26/0x190 [sep4_0]
 RSP <ffff88101968b838>
CR2: 000000000000100c
BUG: unable to handle kernel
---[ end trace c34e52112c8c3565 ]---

 

BSOD with vtss.sys


I'm regularly experiencing BSODs (almost every day) due to a "System Service Exception" involving vtss.sys on Windows 10. They seem to happen randomly, and not while I'm using VTune Amplifier. The last three minidumps and some system information are here.

Thanks in advance for your help.

Data Collection Failed


I tried to use VTune to analyze a TensorFlow (basically Python + C++) workload, and the Python process died as soon as I started data collection (by pushing the Start button). The Python process died with the message:

AMPLXE_TPSSCOLLECTOR: init1226: attach_notification_result == tpss_er_success : attach_notification_result = 14
Assertion failed: init1226: attach_notification_result == tpss_er_success : attach_notification_result = 14. Please contact the technical support. Segmentation fault (core dumped)

I also saw this message in VTune:

11435 : ERROR : Stack size provided to sigaltstack is too small. Please increase the stack size to 64K minimum.
11435 : WARNING : Function 'PyEval_EvalFrameEx' can be analyzed incorrectly because it uses indirect branch instructions. 

I increased the stack size, but that didn't change anything.

 

 

 

Thread Topic: Bug Report

compatibility Beignet project driver and Vtune


Hello,

Does anyone know if it is possible to use VTune to optimize OpenCL code through the Beignet project (in lieu of the Intel OpenCL driver)?

thanks

Philippe

Thread Topic: Question

VTune assertion


Hi,

I'm getting the following assertion whenever I try to profile any exe in VTune. This is for a fresh install of VTune from the 19th March.
I've tried building a simple "Hello world" exe as both 32-bit and 64-bit with the MSVC 2015 compiler, and as 32-bit with MinGW 4.9.2. I've tried different analysis types (Basic Hotspots, Advanced Hotspots, and Concurrency) and get the same result for all of them.
Does anyone have any ideas?

Thanks,
Chris

[Assertion]

CrashedPID: 12140
CrashedTID: 7716
Expression: entryPoint != 0 && entryPoint != INVALID_ADDRESS
File: C:\bb\INNLphep2w6r\b\b\tmpd2pnho\vcs\perfrun1\plugins\etw\src\kernel\controller.cpp
Line: 376
Product: Intel(R) VTune(TM) Amplifier XE 2017 Update 2; 499904
ReportPath: C:\Users\CHRIS\AppData\Local\Temp\amplxe-log-chris\2017-03-21-17-01-35-785.amplxe-runss.exe\


[Products]
Package ID: N/A
Package Contents: Intel(R) VTune(TM) Amplifier XE 2017 Update 2
Build Number: 499904


[System]

OS Name:                   Microsoft Windows 7 Professional
OS Version:                6.1.7601 Service Pack 1 Build 7601
OS Manufacturer:           Microsoft Corporation
OS Configuration:          Member Workstation
OS Build Type:             Multiprocessor Free
Registered Organization:   Microsoft
Original Install Date:     05/12/2013, 08:33:36
System Boot Time:          21/03/2017, 12:27:45
System Manufacturer:       Dell Inc.
System Model:              OptiPlex 9020
System Type:               x64-based PC
Processor(s):              1 Processor(s) Installed.
                           [01]: Intel64 Family 6 Model 60 Stepping 3 GenuineIntel ~3401 Mhz
BIOS Version:              Dell Inc. A16, 18/05/2016
Windows Directory:         C:\Windows
System Directory:          C:\Windows\system32
Boot Device:               \Device\HarddiskVolume2
System Locale:             en-gb;English (United Kingdom)
Input Locale:              en-gb;English (United Kingdom)
Time Zone:                 (UTC+00:00) Dublin, Edinburgh, Lisbon, London
Total Physical Memory:     24,484 MB
Available Physical Memory: 9,868 MB
Virtual Memory: Max Size:  48,966 MB
Virtual Memory: Available: 33,559 MB
Virtual Memory: In Use:    15,407 MB
Page File Location(s):     C:\pagefile.sys

Thread Topic: Help Me

VTune 0x40000024 error for Locks and Waits


Hi, guys.

I'm trying to run the Locks and Waits analysis, but I get a 0x40000024 error for both my project and the Intel demo project. I searched this forum for this issue but haven't found anything yet, since there is little information about this mode.

Let me specify my configuration:

C++ code, Visual Studio 13, VTune XE 2016 Update 4 (470476). I installed the software with the installer, ran the proper .bat file in the Intel VTune folder, and set up the debugging settings in release mode.

I'm trying to run the Locks and Waits analysis in paused mode (I also tried simply running it). As a result, VTune doesn't run my application but starts the Finalize stage immediately and generates the 0x40000024 error.
Unfortunately, I'm working with a huge legacy code base, and my aim is to parallelize it with OpenMP. I found that some locks obviously exist, and I need the tool to locate them faster. Please advise.

Best regards, Maxim

Thread Topic: Help Me

VTune Amplifier's running can not end.


I tried to use Amplifier to analyze a sample program, but the analysis never finishes once it starts.

My program is very simple.

The command line outputs some error information: Xlib: extension "RANDR" missing on display 127.0.0.1

Amplifier version: 2017 r2

My license is the Professional Edition.


How to change directory for VTune in command line


I am attempting to collect VTune data via the command line, but it seems to want to store information in a temporary directory, and I can't figure out the option to tell it to use a different directory (there is insufficient disk space in the default directory VTune selects).

I first ran like this:

amplxe-cl -collect <analysis type> ./a.out 

But I got errors about directory so I tried this: 

amplxe-cl -collect <analysis type> -result-dir ./dir1 -user-data-dir ./dir2 ./a.out 

However, I get the same error either way: 

amplxe: Error: There is not enough free disk space available in the `/tmp/amplxe-tmp-drmackay/amplxe-res-66436-9349951234026312/data.0' directory. The free disk space is less than 100 MB.

How can I tell VTune not to use /tmp/amplxe-tmp-. . . but to use dir1, dir2, or some other directory instead? I don't have permission to increase the size of /tmp.

Thank you kindly.
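
One thing that may be worth trying here, sketched under assumptions: many Unix tools choose their scratch location from the TMPDIR environment variable, and the sketch below assumes the collector does the same. The VTUNE_SCRATCH variable and the vtune-tmp directory name are made up for illustration, and hotspots merely stands in for the `<analysis type>` placeholder above.

```shell
# Assumption: the collector honors TMPDIR (the common Unix convention).
# VTUNE_SCRATCH and "vtune-tmp" are hypothetical names used for this sketch.
scratch="${VTUNE_SCRATCH:-$PWD/vtune-tmp}"
mkdir -p "$scratch"

# Show how much room the chosen directory has (the error above
# mentions a 100 MB threshold).
df -Pk "$scratch" | awk 'NR==2 {print "free MB:", int($4/1024)}'

# Point temporary files at the new location for this shell session.
export TMPDIR="$scratch"

# Run the collection only if the tool is actually on PATH.
if command -v amplxe-cl >/dev/null 2>&1; then
  amplxe-cl -collect hotspots -result-dir ./dir1 ./a.out
fi
```

If the error message still names /tmp after this, the assumption about TMPDIR does not hold for this version.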

 

Thread Topic: Help Me

determining SEP version or build date


I entered command 'sep -version' on my KNL and see this:

Sampling Enabling Product version: 4.0 built on Feb  7 2017 02:38:12
SEP User Mode Version: 4.0.0
SEP Driver Version: 4.0.0
PAX Driver Version: 1.0.1
Platform type: 102
CPU name: Intel(R) Processor code named Knights Landing
PMU: knl
Sampling interrupt mode: Maskable
Copyright (C) 2007-2016 Intel Corporation. All rights reserved.
Application 4177708 resources: utime ~0s, stime ~0s, Rss ~5272, inblocks ~0, outblocks ~0

Can you tell me if this was built from the 2017 Update 2 sources?  Our admins will not tell me which 2017 version's sources they used to build this driver and I need to know if this is the Update 2 version of SEP

Thanks

Ron
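
A rough way to gather more identifying detail about the loaded drivers, as a sketch only: the kernel exposes `version` and `srcversion` files under /sys/module for modules that were built with that metadata. Whether the SEP modules carry it is an assumption here, and the module name prefixes are taken from the lsmod and panic output quoted in this channel. The values would still have to be compared against a driver built from known Update 2 sources.

```shell
# Filter lsmod-style output (on stdin) down to the sampling-driver modules.
# The name prefixes are taken from module lists quoted in this channel.
list_sampling_modules() {
  awk '$1 ~ /^(sep|socperf|pax|vtsspp)/ {print $1}'
}

if command -v lsmod >/dev/null 2>&1; then
  for mod in $(lsmod | list_sampling_modules); do
    echo "== $mod"
    # These files exist only if the module was built with the metadata.
    for field in version srcversion; do
      if [ -r "/sys/module/$mod/$field" ]; then
        echo "$field: $(cat "/sys/module/$mod/$field")"
      fi
    done
  done
fi
```

Running the same loop on a machine where the driver was built from a known release gives a reference to compare against.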

Remote client?


 

I am able to generate results with VTune and Advisor on a KNL development system that we have using command line invocation. When I try to load the results into the GUI over two ssh hops the data will not load and VTune and Advisor will just hang if the latency is too long. I desperately need a linux client that does not require a license so that I can locally view results that were generated on a system with a license. I have been trying to get access to a trial version of parallel studio to get me through to a deadline but when I request it I never get an email telling me how to download it. Can someone please help me out?

Thanks.

jgw

Thread Topic: Question

memory access fails on knl


I am trying to run VTune memory access analysis on KNL.  I have tried this on two different KNL systems - run by different organizations and it fails the same way on both systems - so it is not unique to one system or organization.   I run by entering:

amplxe-cl -collect memory-access -result-dir dir1 -user-data-dir dir2 ./a.out

When I do this I get the error:

amplxe: Warning: To profile kernel modules during the session, make sure they are available in the /lib/modules/kernel_version/ location.
amplxe: Error: Failed to execute sep process. Data collection is interrupted.
amplxe: Internal Error

(I am not interested in profiling the kernel modules. I will be content with just analyzing my code. Is memory access analysis impossible without profiling kernel modules?)

To verify that sep is loaded, I followed advice from another forum thread and ran lsmod | grep. Here is the output from that command:

lsmod | grep sep
sep4_0                767375  516
socperf2_0             33414  1 sep4_0

Here is information about VTune version:

amplxe-cl -version
Intel(R) VTune(TM) Amplifier XE 2017 Update 2 (build 499904) Command Line Tool
Copyright (C) 2009-2017 Intel Corporation. All rights reserved.

 

Thread Topic: Bug Report

SEP (2017.1.0.486011) does not build on linux-4.10.8


I tried to compile the SEP driver. Unfortunately, it seems that it doesn't compile for Linux kernel 4.10.8.

Do updated sources exist?

/opt/intel/vtune_amplifier_xe_2017.1.0.486011/sepdk/src/vtsspp/user_vm.c: In function 'vtss_user_vm_page_pin':
/opt/intel/vtune_amplifier_xe_2017.1.0.486011/sepdk/src/vtsspp/user_vm.c:290:72: warning: passing argument 4 of 'get_user_pages' makes pointer from integer without a cast [-Wint-conversion]
         rc = vtss_get_user_pages(this->m_task, this->m_mm, addr, 1, 0, 1, &this->m_page, &this->m_vma);
                                                                        ^
/opt/intel/vtune_amplifier_xe_2017.1.0.486011/sepdk/src/vtsspp/user_vm.c:270:115: note: in definition of macro 'vtss_get_user_pages'
 #define vtss_get_user_pages(task, mm, start,nr_pages,write,force, pages,vmas) get_user_pages(start,nr_pages,write,force, pages,vmas)
                                                                                                                   ^~~~~
In file included from /opt/intel/vtune_amplifier_xe_2017.1.0.486011/sepdk/src/vtsspp/user_vm.h:34:0,
                 from /opt/intel/vtune_amplifier_xe_2017.1.0.486011/sepdk/src/vtsspp/user_vm.c:34:
./include/linux/mm.h:1271:6: note: expected 'struct page **' but argument is of type 'int'
 long get_user_pages(unsigned long start, unsigned long nr_pages,
      ^~~~~~~~~~~~~~
/opt/intel/vtune_amplifier_xe_2017.1.0.486011/sepdk/src/vtsspp/user_vm.c:290:75: error: passing argument 5 of 'get_user_pages' from incompatible pointer type [-Werror=incompatible-pointer-types]
         rc = vtss_get_user_pages(this->m_task, this->m_mm, addr, 1, 0, 1, &this->m_page, &this->m_vma);
                                                                           ^
/opt/intel/vtune_amplifier_xe_2017.1.0.486011/sepdk/src/vtsspp/user_vm.c:270:122: note: in definition of macro 'vtss_get_user_pages'
 #define vtss_get_user_pages(task, mm, start,nr_pages,write,force, pages,vmas) get_user_pages(start,nr_pages,write,force, pages,vmas)
                                                                                                                          ^~~~~
In file included from /opt/intel/vtune_amplifier_xe_2017.1.0.486011/sepdk/src/vtsspp/user_vm.h:34:0,
                 from /opt/intel/vtune_amplifier_xe_2017.1.0.486011/sepdk/src/vtsspp/user_vm.c:34:
./include/linux/mm.h:1271:6: note: expected 'struct vm_area_struct **' but argument is of type 'struct page **'
 long get_user_pages(unsigned long start, unsigned long nr_pages,
      ^~~~~~~~~~~~~~
/opt/intel/vtune_amplifier_xe_2017.1.0.486011/sepdk/src/vtsspp/user_vm.c:270:79: error: too many arguments to function 'get_user_pages'
 #define vtss_get_user_pages(task, mm, start,nr_pages,write,force, pages,vmas) get_user_pages(start,nr_pages,write,force, pages,vmas)
                                                                               ^
/opt/intel/vtune_amplifier_xe_2017.1.0.486011/sepdk/src/vtsspp/user_vm.c:290:14: note: in expansion of macro 'vtss_get_user_pages'
         rc = vtss_get_user_pages(this->m_task, this->m_mm, addr, 1, 0, 1, &this->m_page, &this->m_vma);
              ^~~~~~~~~~~~~~~~~~~
In file included from /opt/intel/vtune_amplifier_xe_2017.1.0.486011/sepdk/src/vtsspp/user_vm.h:34:0,
                 from /opt/intel/vtune_amplifier_xe_2017.1.0.486011/sepdk/src/vtsspp/user_vm.c:34:
./include/linux/mm.h:1271:6: note: declared here
 long get_user_pages(unsigned long start, unsigned long nr_pages,
      ^~~~~~~~~~~~~~
cc1: some warnings being treated as errors
make[3]: *** [scripts/Makefile.build:295: /opt/intel/vtune_amplifier_xe_2017.1.0.486011/sepdk/src/vtsspp/user_vm.o] Error 1
make[2]: *** [Makefile:1490: _module_/opt/intel/vtune_amplifier_xe_2017.1.0.486011/sepdk/src/vtsspp] Error 2
make[2]: Leaving directory '/usr/lib/modules/4.10.8-1-ARCH/build'
make[1]: [Makefile:236: all] Error 2 (ignored)
cp -f vtsspp.ko vtsspp-x32_64-4.10.8-1-ARCHsmp.ko
cp: cannot stat 'vtsspp.ko': No such file or directory
make[1]: [Makefile:237: all] Error 1 (ignored)
make[1]: Leaving directory '/opt/intel/vtune_amplifier_xe_2017.1.0.486011/sepdk/src/vtsspp'
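
The errors above are consistent with a kernel API change: since Linux 4.9, get_user_pages() takes a single gup_flags argument in place of the separate write and force arguments, so the six-argument call in the macro no longer matches the declaration. The following is only a rough check of whether a given kernel release falls on the newer side of that change, not a fix for the driver sources. The function name is made up for this sketch.

```shell
# Returns success if the kernel release string (e.g. from `uname -r`)
# is 4.9 or newer, where get_user_pages() takes gup_flags.
# kernel_has_gup_flags is a hypothetical helper name.
kernel_has_gup_flags() {
  major=${1%%.*}            # text before the first dot
  rest=${1#*.}              # text after the first dot
  minor=${rest%%[!0-9]*}    # leading digits of the remainder
  [ "$major" -gt 4 ] || { [ "$major" -eq 4 ] && [ "$minor" -ge 9 ]; }
}

if kernel_has_gup_flags "$(uname -r)"; then
  echo "kernel uses the gup_flags form of get_user_pages()"
else
  echo "kernel predates the gup_flags form of get_user_pages()"
fi
```

For the release in the log, `kernel_has_gup_flags "4.10.8-1-ARCH"` succeeds, which matches the compile failure.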

 

TSX sampling of aborts fails


I was able to collect TSX sampling data (option '1. Transactional success'), but if I select option '2. Aborts', VTune gives me an error:

Error 0x40000024 (No data) no data is collected ...

I use the same binary (an application using TSX) in both cases. Why is VTune unable to collect data for TSX aborts?

Oliver

TSX hotspot report with total thread count 1?


I'm using VTune 2017 for TSX exploration.

I'm wondering why the TSX hotspot report shows a total thread count of 1 when the application uses 32 threads (== logical cores).

The other TSX basic exploration report (TSX success/abort rate) correctly displays 'total thread count: 32'.

Oliver


Unable to resolve function names and call stack


Hi, I have created my project with CMake, and now I'm trying to profile it using VTune 2017 and Visual Studio 15. I've set up the Microsoft Symbol Server as well. However, I'm unable to resolve all of the function names, and there is no call stack information available for any of the functions.

I'm still getting these warnings, even after I tried to manually download the debug symbols from Microsoft, and it seems that this is at least the reason for the missing function names. How do I get the right debugging symbols?

I'm building with debug information, of course, and as I said, I've set up the debugging symbols.

I had all of this working once, but I formatted my computer, and now it no longer works. I also posted this on Stack Overflow, but since no one has replied there yet, I thought I might give it a try here as well.

http://stackoverflow.com/questions/43365235/intel-vtune-unable-to-locate-all-windows-debug-symbols

I'd appreciate any help!

Thread Topic: Help Me

Cannot find amplxe-gui or cl to open Vtune


Hello,

I have just downloaded Parallel Studio 2017 Cluster Edition for CentOS 7, and after the GUI installation, I followed the step-by-step guide to run VTune, based on the Getting Started HTML file. However, when it asks me to run amplxe-gui, I cannot find it!

Can anybody help me with this?

Thanks for the attention!

Rgds.

Diego Menescal
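
A possible starting point, sketched under assumptions: VTune Amplifier XE installations of this generation ship an environment script named amplxe-vars.sh that puts the tools on PATH once sourced. The sketch assumes a default install under /opt/intel, and the helper name is made up for illustration.

```shell
# Print every amplxe-vars.sh found under an installation root.
# $1: root to search (defaults to /opt/intel, an assumed install location).
# find_vtune_env is a hypothetical helper name.
find_vtune_env() {
  find "${1:-/opt/intel}" -maxdepth 3 -name 'amplxe-vars.sh' 2>/dev/null | sort
}

# Source the last match (typically the newest version directory),
# then check whether the GUI launcher is now on PATH.
env_script=$(find_vtune_env | tail -n 1)
if [ -n "$env_script" ]; then
  . "$env_script"
  command -v amplxe-gui || echo "amplxe-gui still not on PATH"
else
  echo "no amplxe-vars.sh found under /opt/intel"
fi
```

If the product was installed to a custom location, pass that directory to `find_vtune_env` instead.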

Join the Intel® Parallel Studio XE 2018 Beta program


We would like to invite you to participate in the Intel® Parallel Studio XE 2018 Beta program. In this beta test, you will gain early access to new features and analysis techniques. Try them out, tell us what you love and what to improve, so we can make our products better for you. 

Registration is easy. Complete the pre-beta survey, register, and download the beta software:
Intel® Parallel Studio XE 2018 Pre-Beta survey

The 2018 version brings together exciting new technologies along with improvements to Intel’s existing software development tools:

Modernize Code for Performance, Portability and Scalability on the Latest Intel® Platforms

  • Use fast Intel® Advanced Vector Extensions 512 (Intel®AVX-512) instructions on Intel® Xeon® and Intel®Xeon® Phi™ processors and coprocessors
  • Intel® Advisor - Roofline finds high impact, but under optimized loops
  • Intel® Distribution for Python* - Faster Python* applications
  • Stay up-to-date with the latest standards and IDE:
    • C++2017 draft parallelizes and vectorizes C++ easily using Parallel STL*
    • Full Fortran* 2008, Fortran 2015 draft
    • OpenMP* 5.0 draft, Microsoft Visual Studio* 2017
  • Accelerate MPI applications with Intel® Omni-Path Architecture

Flexibility for Your Needs

  • Application Snapshot - Quick answers:  Does my hybrid code need optimization?
  • Intel® VTune™ Amplifier – Profile private clouds with Docker* and Mesos* containers, Java* daemons

 And much more…
For more details about this beta program, a FAQ, and What’s New, visit: Intel® Parallel Studio XE 2018 Beta page.
As a highly-valued customer and beta tester, we welcome your feedback to our development teams via this program at our Online Service Center.

2018 Beta App Perf Snapshot memory stats


The new 2018 Beta Application Performance Snapshot is easy to use. One question: how do I interpret the "Memory Footprint" statistics for an MPI job? I ran a 16-rank job on 4 nodes, 4 ranks per node. Are the Mean and Peak values an average for 1 rank, an average footprint for 1 node, or something else?

Ron

Cache Miss in V-Tune


Hi everybody,

I'm a newcomer, and I wonder if I'm posting in the right place :)

I work as a developer on a telecommunication system (4G core network). Recently, I started using DPDK (http://dpdk.org/) and have to evaluate its performance. I found Intel VTune and have been using it for a couple of days, but the VTune results do not meet my expectations. For instance:

- I followed the tutorial "Tutorial: Identifying Hardware Issues" in order to find the Cache Miss Rate, but it shows nothing.
- I ran VTune on a DPDK example (l2fwd) and got the same result.

My environment:
- Intel(R) Xeon(R) CPU E5-2697A v4 @ 2.60GHz
- Centos OS 7.2.1511

I expect to get "CPI Rate", "LLC Miss", "Branch Mispredict", and so on, because they are important parameters for me and affect my decision on whether to buy VTune.

I don't know how to contact Intel software support, so I would appreciate it if anyone here could help me get the results shown in the guide (hw_issues_amplxe_lin.pdf).

Thanks in advance,
Tuan Anh PHAM

 

Thread Topic: Help Me

