Estimating elapsed time for a VTune analysis (knob sampling-interval)

Hi,

I ran an HPC Performance analysis (VTune 2020u0) on an Intel 8280 system (RHEL 7.6) with default settings:

time mpirun -np $SLURM_NPROCS -ppn $SLURM_NTASKS_PER_NODE  $OPTS  amplxe-cl -collect hpc-performance -data-limit 0 -result-dir result_hpcperf -- ${APP_INSTALL_ROOT}/appname.exe

The analysis part

vtune: Executing actions  0 %
........
vtune: Executing actions 100 % done

took around 45 minutes, and the "result_hpcperf.nodeXX" directory held around 20 GB of data.

Q1: If my Linux kernel version is 3.10.0-957.el7.x86_64, what is the default sampling interval?

Q2: If I halve the sampling interval for an analysis, roughly how much elapsed time and output data should I expect for the VTune analysis and report generation part?

- I expected that halving the sampling interval (default 1 ms -> 0.5 ms) would make the analysis and result generation take around 90 minutes and produce around 40-50 GB of data. Please let me know if my assumptions are incorrect.

Q3: Also, if I halve the sampling interval for an analysis, how much more accurate can I expect the output metrics to be (in general, based on your observations with this tool)?
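The linear assumption behind that estimate can be sketched as below. This is only an illustration of the arithmetic, assuming that run time and data volume scale inversely with the sampling interval and that the default interval is 1 ms; neither assumption is confirmed. `scale_linear` is a hypothetical helper name, not a VTune command.

```shell
# Baseline numbers reported for the default run.
base_minutes=45
base_gb=20
base_interval_ms=1      # assumed default, not confirmed
new_interval_ms=0.5

# Hypothetical helper: scale a baseline quantity by the ratio old/new interval.
# Assumes cost is inversely proportional to the sampling interval.
scale_linear() {
    awk -v base="$1" -v old="$2" -v new="$3" 'BEGIN { printf "%g\n", base * old / new }'
}

echo "estimated minutes: $(scale_linear "$base_minutes" "$base_interval_ms" "$new_interval_ms")"
echo "estimated GB:      $(scale_linear "$base_gb" "$base_interval_ms" "$new_interval_ms")"
```

With the numbers above this gives 90 minutes and 40 GB, which is the estimate the question is based on.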

As per this article (CPU sampling interval, ms field), I assumed the default sampling interval is 1 ms, and I reran the HPC Performance analysis with sampling-interval set to 0.5 ms:

time mpirun -np $SLURM_NPROCS -ppn $SLURM_NTASKS_PER_NODE  $OPTS  amplxe-cl -collect hpc-performance -data-limit 0 -result-dir result_hpcperf -knob sampling-interval=0.5  -- ${APP_INSTALL_ROOT}/appname.exe

The last line to appear in stdout was:

vtune: Executing actions  0 %

Around 11 hours have elapsed since then, and around 150 GB of data has been generated in the results directory.

Within the results directory (find . -printf "%T+\t%p\n" | sort) I saw that the last file was changed around 11 hours ago. That file has the following contents:
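The same check can be scripted so it is easy to repeat while the job runs. The sketch below assumes GNU find, stat and date (as shipped with RHEL 7); `newest_file` and `seconds_since_last_write` are hypothetical helper names, and the default directory name is taken from the run above.

```shell
# Hypothetical helper: print the most recently modified regular file under a directory.
newest_file() {
    find "$1" -type f -printf '%T@\t%p\n' | sort -n | tail -n 1 | cut -f2-
}

# Hypothetical helper: seconds elapsed since that file was last modified.
seconds_since_last_write() {
    newest=$(newest_file "$1")
    [ -n "$newest" ] || return 1
    now=$(date +%s)
    mtime=$(stat -c %Y "$newest")
    echo $(( now - mtime ))
}

# Result directory to inspect; defaults to the name used in this run.
result_dir=${1:-result_hpcperf.node3}
if [ -d "$result_dir" ]; then
    echo "newest file:  $(newest_file "$result_dir")"
    echo "idle seconds: $(seconds_since_last_write "$result_dir")"
fi
```

A large idle time only shows that nothing under the directory has been written recently; it does not by itself say whether finalization is still making progress.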

[user@headnode01 hpcperf_char_00003]$ cat result_hpcperf.node3/config/log.cfg
<?xml version='1.0' encoding='UTF-8'?>

<bag xmlns:int="http://www.w3.org/2001/XMLSchema#int" xmlns:long="http://www.w3.org/2001/XMLSchema#long">
 <message_entry_t int:status="2" cap="Data collection completed successfully" msg="" long:timeStamp="1586803953480"/>
 <message_entry_t int:status="2" cap="Data collection completed successfully" msg="" long:timeStamp="1586803953542"/>
 <message_entry_t int:status="2" cap="Data collection completed successfully" msg="" long:timeStamp="1586803953687"/>
 <message_entry_t int:status="2" cap="Data collection completed successfully" msg="" long:timeStamp="1586803953748"/>
 <message_entry_t int:status="2" cap="Data collection completed successfully" msg="" long:timeStamp="1586803954281"/>
 <message_entry_t int:status="1" cap="Data collection completed with warnings" msg="Please see warning messages for details. " long:timeStamp="1586809230671">
  <message msg="Analyzing data in the node-wide mode. The hostname (node61) will be added to the result path/name." int:severity="1"/>
  <message msg="Peak bandwidth measurement started." int:severity="1"/>
  <message msg="Peak bandwidth measurement finished." int:severity="1"/>
  <message msg="To enable hardware event-base sampling, VTune Profiler has disabled the NMI watchdog timer. The watchdog timer will be re-enabled after collection completes." int:severity="2"/>
  <message msg="Collection started." int:severity="1"/>
  <message msg="Collection stopped." int:severity="1"/>
 </message_entry_t>
</bag>
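The timeStamp values in this log look like epoch milliseconds, so they can be converted to wall-clock times to see when each entry was written. A minimal sketch, assuming GNU date and that the values are indeed milliseconds since the Unix epoch; `ms_to_utc` is a hypothetical helper name.

```shell
# Hypothetical helper: convert an epoch-milliseconds value to a UTC timestamp.
# Assumes GNU date (supports -d "@SECONDS").
ms_to_utc() {
    date -u -d "@$(( $1 / 1000 ))" '+%Y-%m-%d %H:%M:%S'
}

first_ts=1586803953480   # first "Data collection completed successfully" entry
last_ts=1586809230671    # "Data collection completed with warnings" entry

echo "first entry: $(ms_to_utc "$first_ts") UTC"
echo "last entry:  $(ms_to_utc "$last_ts") UTC"
echo "gap in whole minutes: $(( (last_ts - first_ts) / 60000 ))"
```

The log itself does not say which phase each entry belongs to, so the gap between the two timestamps is only a rough indicator of elapsed time.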

Also, on the compute node (node3) I checked the running processes with the top command:

   PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
127588 root      20   0 4128520  82480   3308 R 100.0  0.0 563:13.50 sep
    10 root      20   0       0      0      0 S   6.2  0.0   0:22.52 rcu_sched
     1 root      20   0   56068   8276   2620 S   0.0  0.0   0:26.51 systemd

Here too, it seems that the sep command (driver) has been running for about 9 hours at 100% CPU with almost no memory utilization. I am not sure whether the application and the sep driver are running correctly. Is there a way to confirm this (via system logs or sep driver logs)?

It would be very helpful if I could get an estimate of how long this analysis will take to finish in my scenario.

- I am asking because I will adjust the "walltime" for my VTune jobs on my cluster accordingly.

Please let me know if I can provide more information to help you answer my questions.
