Re: slowdown with -std=gnu18 with respect to -std=c99

Thank you very much, Alexander.

> Date: Tue, 3 May 2022 12:09:32 +0300 (MSK)
> From: Alexander Monakov <amonakov@xxxxxxxxx>
> cc: gcc-help@xxxxxxxxxxx, stephane.glondu@xxxxxxxx, sibid@xxxxxxx
> 
> On Tue, 3 May 2022, Paul Zimmermann via Gcc-help wrote:
> 
> > Does anyone have a clue?
> 
> I can reproduce a difference, but in my case it's simply because in -std=gnuXX
> mode (as opposed to -std=cXX) GCC enables FMA contraction, enabling the last few
> steps in the benchmarked function to use fma instead of separate mul/add
> instructions.

But in that case -std=gnuXX should give better (i.e. smaller) timings than
-std=cXX, whereas we observe worse timings with -std=gnuXX?

> (regarding __builtin_expect, it also makes a small difference in my case,
> it seems GCC generates some redundant code without it, but the difference is
> 10x smaller than what presence/absence of FMA gives)
> 
> I think you might be able to figure it out on your end if you run both variants
> under 'perf stat', note how cycle count and instruction counts change, and then
> look at disassembly to see what changed. You can use 'perf record' and 'perf
> report' to easily see the hot code path; if you do that, I'd recommend to run
> it with the same sampling period in both cases, e.g. like this:
> 
>     perf record -e instructions:P -c 500000 ./perf ...

Thank you, we'll investigate that.

Best regards,
Paul


