Channel: Intel® VTune™ Profiler (Intel® VTune™ Amplifier)

100% Core bound


I am new to VTune and am using it to analyse a simple algorithm: finding the min and max of a large array. I ran the Microarchitecture Exploration analysis hoping to find out whether the code is memory bound or front-end bound, but the result shows 100% core bound. What does it mean to be 100% core bound in this situation? Or, more likely, what am I doing wrong when measuring? Sorry if this is a stupid question, and thank you for taking the time to read it.

Other info:

Linux kernel version: 4.14.81.bm.21-amd64

LSB: Debian GNU/Linux 9.12 (stretch)

Model name: Intel(R) Xeon(R) Gold 6130 CPU @ 2.10GHz (Skylake)

Code I am running. The source array can be up to 1 GB, which corresponds to a src_size of 2^28 because each element is 4 bytes:

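#include <cstdint>

// noinline keeps the function visible as its own entry in the profiler.
__attribute__((noinline))
void findMinMax(int32_t *src, uint32_t src_size, int32_t &min, int32_t &max) {
	// Assumes src_size >= 1: the first element seeds both results.
	min = src[0];
	max = src[0];
	// Unsigned index, so the comparison with src_size is not signed/unsigned.
	for (uint32_t i = 1; i < src_size; i++) {
		auto current = src[i];
		if (current < min) {
			min = current;
		}
		if (current > max) {
			max = current;
		}
	}
}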

The executable is compiled with GCC 9.3 using the -O3 flag. I have also tried other optimisation levels, including no optimisation flag at all, and all of them yield 100% core bound.

I have also tried various other simple algorithms, and they all seem to show the same 100% core bound result.
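For anyone trying to reproduce the setup, here is a minimal sketch of how the function could be fed and sanity-checked before profiling. It is not my exact benchmark: `makeInput`, `matchesReference`, `kMaxElements`, and the fixed seed are hypothetical names and choices made up for illustration, and `src` is taken as `const` here only so the helper can pass a read-only vector.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <random>
#include <vector>

// 2^28 elements of 4 bytes each are 2^30 bytes, i.e. 1 GiB.
constexpr std::uint64_t kMaxElements = std::uint64_t{1} << 28;
static_assert(kMaxElements * sizeof(std::int32_t) == (std::uint64_t{1} << 30),
              "2^28 int32_t values should occupy exactly 1 GiB");

// Loop from the post, repeated so this block compiles on its own
// (const added to src; requires src_size >= 1).
__attribute__((noinline))
void findMinMax(const std::int32_t *src, std::uint32_t src_size,
                std::int32_t &min, std::int32_t &max) {
    min = src[0];
    max = src[0];
    for (std::uint32_t i = 1; i < src_size; i++) {
        const auto current = src[i];
        if (current < min) {
            min = current;
        }
        if (current > max) {
            max = current;
        }
    }
}

// Hypothetical helper: pseudo-random input from a fixed seed, so that
// repeated profiling runs see the same data.
std::vector<std::int32_t> makeInput(std::size_t count, std::uint32_t seed = 42) {
    std::mt19937 gen(seed);
    std::uniform_int_distribution<std::int32_t> dist(
        std::numeric_limits<std::int32_t>::min(),
        std::numeric_limits<std::int32_t>::max());
    std::vector<std::int32_t> data(count);
    for (auto &value : data) {
        value = dist(gen);
    }
    return data;
}

// Hypothetical helper: compares findMinMax against std::minmax_element.
bool matchesReference(const std::vector<std::int32_t> &data) {
    assert(!data.empty());
    assert(data.size() <= std::numeric_limits<std::uint32_t>::max());
    std::int32_t min = 0;
    std::int32_t max = 0;
    findMinMax(data.data(), static_cast<std::uint32_t>(data.size()), min, max);
    const auto ref = std::minmax_element(data.begin(), data.end());
    return min == *ref.first && max == *ref.second;
}
```

A `main` would call `makeInput` with whatever element count is being profiled (up to `kMaxElements`), optionally check `matchesReference` on a small input first, and then call `findMinMax` on the large array. Filling the array before the call also means the pages are already touched when the measured loop runs.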

