Channel: Intel® VTune™ Profiler (Intel® VTune™ Amplifier)

MEM_TRANS_RETIRED.LOAD_LATENCY_GT* unexpected results.


Hi all,

I am trying to use VTune Amplifier (Linux version) to profile memory access latency. To get familiar with it, I am profiling a toy program that just loads a big array of data. I use the command-line version like this:

amplxe-cl -collect-with runsa -knob event-config=MEM_TRANS_RETIRED.LOAD_LATENCY_GT_32,MEM_TRANS_RETIRED.LOAD_LATENCY_GT_64 ./load

The result I get is the following.

============================================================================

CPU
---
Parameter          r000runsa                      
-----------------  -------------------------------
Name               Intel(R) Xeon(R) E5v2 processor
Frequency          2394229995                     
Logical CPU Count  48                             

Summary
-------
Elapsed Time:  7.757
CPU Usage:     1.000

Event summary
-------------
Hardware Event Type                   Hardware Event Count:Self  Hardware Event Sample Count:Self  Events Per Sample
------------------------------------  -------------------------  --------------------------------  -----------------
CPU_CLK_UNHALTED.REF_TSC                            18538027807                              9269  2000003          
MEM_TRANS_RETIRED.LOAD_LATENCY_GT_32                          0                                 0  100007           
MEM_TRANS_RETIRED.LOAD_LATENCY_GT_64                      24036                                 6  2003             
amplxe: Executing actions 100 % done

=======================================================================
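As a quick arithmetic check on the event summary above, the reported event counts can be compared with sample count times events per sample. This is only a sketch in Python using the numbers copied from the output; the names `rows` and `scaling_factor` are made up for illustration, and reading "Events Per Sample" as the sampling interval is an assumption.

```python
# Rows copied from the "Event summary" table in the output above:
# (event name, reported event count, sample count, events per sample)
rows = [
    ("CPU_CLK_UNHALTED.REF_TSC", 18538027807, 9269, 2000003),
    ("MEM_TRANS_RETIRED.LOAD_LATENCY_GT_32", 0, 0, 100007),
    ("MEM_TRANS_RETIRED.LOAD_LATENCY_GT_64", 24036, 6, 2003),
]


def scaling_factor(count, samples, per_sample):
    """Return count / (samples * per_sample), or None if there were no samples."""
    product = samples * per_sample
    return None if product == 0 else count / product


# Hypothetical helper output: one ratio per event.
factors = {name: scaling_factor(c, s, p) for name, c, s, p in rows}
for name, factor in factors.items():
    print(name, factor)
```

For the clock event the ratio is exactly 1.0, and for *_GT_64 it is exactly 2.0 (24036 versus 6 x 2003 = 12018). The output itself does not say where the factor of two comes from. For *_GT_32 there are no samples at all, so the ratio is undefined.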

From the explanation of the MEM_TRANS_RETIRED.LOAD_LATENCY_GT_* events, the count of *_GT_32 must be at least as large as the count of *_GT_64, since every load slower than 64 cycles is also slower than 32 cycles. In this case it is not, and this behavior is reproducible.
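The expectation can be written down as a small check: for any two thresholds, the count at the lower threshold should not be smaller than the count at the higher one. The following Python sketch applies that to the two reported counts; the function name `nesting_violations` is hypothetical, and the check assumes both events count the same population of loads.

```python
# Reported counts from the event summary, keyed by latency threshold in cycles.
reported = {32: 0, 64: 24036}


def nesting_violations(counts):
    """Return (low, high) threshold pairs where counts[low] < counts[high]."""
    thresholds = sorted(counts)
    return [
        (low, high)
        for i, low in enumerate(thresholds)
        for high in thresholds[i + 1:]
        if counts[low] < counts[high]
    ]


violations = nesting_violations(reported)
print(violations)  # [(32, 64)] for the counts reported above
```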

I checked the errata published in the specification update and found erratum BT241, which says that "The affected events may undercount, resulting in inaccurate memory profiles"; its list of affected events contains MEM_TRANS_RETIRED.LOAD_LATENCY.

Can somebody please explain why the count of MEM_TRANS_RETIRED.LOAD_LATENCY_GT_32 is less than that of MEM_TRANS_RETIRED.LOAD_LATENCY_GT_64?

Thank you,

Best Regards, ARam

