Re: GCC is 7 times slower than Intel? How to optimize? Need help!!

Hi!

On Mon, 2008-10-06 at 23:23 -0700, jackfrost wrote: 
> I've tried  -ffast-math.
> with the same result.

On Tue, 2008-10-07 at 13:05 -0400, Michael Meissner wrote:
> IIRC, one of the spec 2006 suite is heavily dominated by calls to exp.
> So
> programs do exist that are dominated by the 'exp' function.

If you really do need exp (that is, it was not in your code sample by
chance), and if you can tolerate a limited argument range and limited
precision, you can try a well-known hack that computes the exponential
faster, at least on some hardware:

#include <cassert>
#include <cmath>

template<typename T> float fast_2pow(T arg)
{
    // the result must stay inside the float exponent range
    assert(arg > -127);
    assert(arg <  127);

    typedef union
    {
        unsigned int u;
        float        f;
    } uf_t;

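    // fixed-point arg (23 fractional bits) plus the exponent bias;
    // the explicit cast avoids a narrowing error in C++11 and later
    const uf_t x = {static_cast<unsigned int>(int(arg * (1 << 23)) + int(127 * (1 << 23)))};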
    //reset mantissa bits,  f: 2 ^ floor(arg)
    const uf_t exp = { x.u & ~((1 << 23) - 1)};
    //set exponent & sign bits, f: arg - floor(arg) + 1
    const uf_t man = {(x.u &  ((1 << 23) - 1)) | (127 << 23)}; 

    //return approximation
    return exp.f * ((man.f * man.f + 2.) * 0.3330234735869276f);
}

template<typename T> float fast_exp(T arg)
{
    static const T scale = 1./log(2.);
    return fast_2pow(arg * scale);
}

(The template parameter is meant to be float or double, not an integer type :)
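
The trick relies on the IEEE 754 single-precision layout: 1 sign bit, 8
exponent bits biased by 127, and 23 mantissa bits. Writing n + 127 into
the exponent field with a zero mantissa yields exactly 2^n, which is
what the exp part of fast_2pow does. A minimal sketch of just that step
follows; pow2_from_bits is a hypothetical helper name that is not part
of the code above, and it uses memcpy instead of the union, assuming a
32-bit IEEE 754 float:

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical helper (not from the original post): build 2^n directly
// from its IEEE 754 single-precision bit pattern.
inline float pow2_from_bits(int n)
{
    assert(n >= -126 && n <= 127);  // exponent range of normal floats only

    // sign = 0, biased exponent = n + 127 in bits 23..30, mantissa = 0
    const std::uint32_t bits = static_cast<std::uint32_t>(n + 127) << 23;

    float f;
    static_assert(sizeof f == sizeof bits, "assumes a 32-bit float");
    std::memcpy(&f, &bits, sizeof f);  // copy the bit pattern into a float
    return f;
}
```

For example, pow2_from_bits(10) gives 1024.0f and pow2_from_bits(-1)
gives 0.5f. The fractional part of the argument is then handled
separately by the polynomial in the mantissa.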

This 2nd-order approximation (continuous values, continuous derivative,
minimized squared relative error) gives a relative error < 0.3% and is
~6.6 times faster than exp(..) on an AMD Turion(tm) 64 X2 with gcc 4.3.0
and the compile switches -march=native -msse2 -O3 -ffast-math -mfpmath=sse,
using your code sample:
http://gcc.gnu.org/ml/gcc-help/2008-10/msg00033.html
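
The error bound can be checked by scanning a range of arguments and
comparing against std::exp. The sketch below restates the 2nd-order
approximation with memcpy in place of the union so that it is
self-contained; the names bits_to_float, fast_2pow_sketch,
fast_exp_sketch and max_rel_error are illustrative assumptions, not
part of the original code, and the scan range is arbitrary:

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>

// Reinterpret a 32-bit pattern as a float (assumes IEEE 754 floats).
inline float bits_to_float(std::uint32_t u)
{
    float f;
    std::memcpy(&f, &u, sizeof f);
    return f;
}

// Same 2nd-order approximation of 2^arg as in the post, restated.
inline float fast_2pow_sketch(double arg)
{
    assert(arg > -127 && arg < 127);
    const std::uint32_t mant_mask = (1u << 23) - 1;
    const std::uint32_t x = static_cast<std::uint32_t>(
        static_cast<std::int32_t>(arg * (1 << 23)) + 127 * (1 << 23));
    const float exp_part = bits_to_float(x & ~mant_mask);                // 2^floor(arg)
    const float man_part = bits_to_float((x & mant_mask) | (127u << 23)); // 1 + frac(arg)
    return exp_part * ((man_part * man_part + 2.) * 0.3330234735869276f);
}

inline float fast_exp_sketch(double arg)
{
    return fast_2pow_sketch(arg * (1. / std::log(2.)));
}

// Largest relative error against std::exp over steps + 1 evenly spaced points.
inline double max_rel_error(double lo, double hi, int steps)
{
    double worst = 0.0;
    for (int i = 0; i <= steps; ++i) {
        const double x = lo + (hi - lo) * i / steps;
        const double exact = std::exp(x);
        worst = std::max(worst, std::fabs(fast_exp_sketch(x) - exact) / exact);
    }
    return worst;
}
```

Calling max_rel_error(-10.0, 10.0, 20000) should stay below 0.003,
consistent with the < 0.3% figure. This only checks accuracy; the
speedup depends on hardware and compiler and has to be timed
separately.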

I have just tested a 4th-order approximation. It gives a relative error <
1e-5 and is still ~5.7 times faster than exp(..).

Regards,
Arturs Zoldners


