Re: surprising optimization of comparison operations for __int128_t

On 07/09/2010 09:11 AM, Mathieu Lacage wrote:

> The attached C++ testcase compares the performance behavior of
> __int128_t used directly vs __int128_t used through an overloaded
> operator <. The overloaded < operator appears faster than the raw
> __int128_t, which I find really surprising, so I fear I am not
> measuring what I think I am measuring. Hints?
> 
> [mathieu@mathieu-laptop benchmark-time]$ g++ --version
> g++ (GCC) 4.4.3 20100127 (Red Hat 4.4.3-4)
> [mathieu@mathieu-laptop benchmark-time]$ g++ -O3 test.cc
> # run raw __int128_t version
> [mathieu@mathieu-laptop benchmark-time]$ time -p ./a.out 100000002 a
> 16384
> 2
> real 0.60
> user 0.60
> sys 0.00
> # run operator < version
> [mathieu@mathieu-laptop benchmark-time]$ time -p ./a.out 100000002 test
> 16384
> 2
> real 0.40
> user 0.40
> sys 0.00

g++ seems to be generating a specialized copy of run_cmp() in the
__int128_t case, with the parameters fixed at a=1 and b=2, in an
attempt to do some constant propagation.  This ought to help, but
unfortunately the back end generates worse code for the specialized
copy than it does for the general function.

This isn't uncommon in optimizing compilers: a transformation that
usually improves code quality occasionally makes things worse.
If you compile with -fdump-tree-optimized and read the dump file it
writes, you'll see what is happening.
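
For readers without the attachment, here is a rough sketch of the shape
of code being discussed.  It is not the original testcase, which is not
reproduced in this thread.  Only run_cmp(), a and b come from the
message above; Int128Wrapper, the loop body, run_cmp_a1_b2() and
check_agreement() are hypothetical names invented for illustration.
The hand-written run_cmp_a1_b2() only mimics, at source level, the kind
of constant-specialized copy the compiler may create internally; the
real clone exists only in the compiler's intermediate code and may look
quite different.

```cpp
#include <cassert>

// Hypothetical wrapper type standing in for the class in the
// (unseen) testcase; it only forwards operator< to __int128_t.
struct Int128Wrapper {
    __int128_t value;
};

inline bool operator<(const Int128Wrapper &lhs, const Int128Wrapper &rhs) {
    return lhs.value < rhs.value;
}

// General form: a and b are ordinary run-time parameters.
// Returns how many of the n iterations observed a < b.
template <typename T>
long run_cmp(T a, T b, long n) {
    long hits = 0;
    for (long i = 0; i < n; ++i) {
        if (a < b) {
            ++hits;
        }
    }
    return hits;
}

// Source-level analogue of a specialized copy with a=1 and b=2
// baked in.  Assumption: this is an illustration only, not what
// GCC actually emits.
long run_cmp_a1_b2(long n) {
    const __int128_t a = 1;
    const __int128_t b = 2;
    long hits = 0;
    for (long i = 0; i < n; ++i) {
        if (a < b) {
            ++hits;
        }
    }
    return hits;
}

// The general and specialized forms must agree on the result;
// only the generated machine code is allowed to differ.
void check_agreement(long n) {
    assert(run_cmp<__int128_t>(1, 2, n) == run_cmp_a1_b2(n));
    assert(run_cmp(Int128Wrapper{1}, Int128Wrapper{2}, n) == run_cmp_a1_b2(n));
}
```

All three variants compute the same count, so any timing difference
between them comes from code generation rather than from the work
requested.  Note that a loop this simple can be folded away entirely by
the optimizer, so it shows the structure of the problem and is not a
usable benchmark as written.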

Andrew.

