Channel: Intel® VTune™ Profiler (Intel® VTune™ Amplifier)

How to measure FLOPS using VTune?


I have trouble understanding some of the counters reported by VTune Amplifier XE 2013 when I use them to calculate FLOPS according to this Intel article. With the VTune counters I get values roughly three times higher than the value I count manually (and I don't think speculation explains the difference). Here are the details:

I believe there should be two floating-point operations here (a MULTIPLY and an ADD), disregarding the float comparison on the return line. (Is this assumption correct?) Counting the number of loop iterations, multiplying by 2 and dividing by the elapsed time gives 0.33 GFLOPS on this 12-core machine (running 12 threads). However, when I follow the excellent article describing how to collect the metrics with VTune, I get a much higher value: 1.06 GFLOPS. There are also branches in another method that contains the loop, but the test data is set up so that the branches always take the same decision and the HasRemaingBudget function is always run. That is, I think branch prediction should be extremely accurate and should not add any extra overhead.
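For reference, the manual estimate described above can be sketched as a small calculation. This is only an illustration in Python, not the original C# code: the function name `gflops`, the iteration count and the elapsed time are hypothetical values chosen so that the result matches the 0.33 GFLOPS quoted above. Only the factor of 2 operations per iteration and the two GFLOPS figures come from the post.

```python
def gflops(iterations: int, flops_per_iteration: int, elapsed_seconds: float) -> float:
    """Return achieved GFLOPS for a loop with a known per-iteration operation count."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return iterations * flops_per_iteration / elapsed_seconds / 1e9


# Hypothetical inputs (not from the post): total iterations across all
# threads and wall-clock time, picked to reproduce the quoted 0.33 GFLOPS.
total_iterations = 1_650_000_000
elapsed_seconds = 10.0

# Two floating-point operations per iteration: one multiply and one add.
manual = gflops(total_iterations, flops_per_iteration=2, elapsed_seconds=elapsed_seconds)

# Value derived from the VTune counters, as quoted in the post.
vtune_reported = 1.06

ratio = vtune_reported / manual
print(f"manual: {manual:.2f} GFLOPS, VTune: {vtune_reported:.2f} GFLOPS, ratio: {ratio:.1f}x")
```

The ratio makes the "about three times" discrepancy explicit. Note that the iteration count must be the total over all 12 threads while the elapsed time is wall-clock time, otherwise the manual figure is off by the thread count.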

Inspecting the assembly that VTune shows for the C# code above gives the following metric:

For row 43 (Order order = ad.Order):

According to the article referenced above, this counter should be included in the FLOPS calculation. But I don't understand what the floating-point operation is here. It looks like a straightforward move of data from memory to a register that involves no floating-point calculation at all. Or am I missing something?

So which is the most accurate way to measure FLOPS in an application: doing it the simple way and counting floating-point operations in the high-level code (C# in this case), or relying on metrics from VTune to capture the cases where floating-point operations seem to be hidden (from non-experts at least)?

