Amusing bad code generation for ARM

I've been investigating a large performance regression in ARM code generation in GCC 4.5.0 (compared to 4.4.x). While doing so I stumbled upon a separate problem, which is not a regression from 4.4.x: the generated assembly code is so horrible it's almost comical. The C code looks like this:

    struct dev_t {
        volatile unsigned R0;
        volatile unsigned R1;
    };
    #define DEV               ((struct dev_t*)0x40011400)
    void write_data(const unsigned *d)
    {
        unsigned i, mask;
        for (i = 0, mask = 1; i < 8; i++, mask <<= 1) {
            if (mask & *d)
                DEV->R0 = 1U << 13;
            else
                DEV->R1 = 1U << 13;
        }
    }

Compiling this with a 4.5.0 cross compiler for arm-none-eabi with:

    arm-none-eabi-gcc -mcpu=cortex-m3 -mthumb -S -O3 -o- bad.c

gives the following assembly code:

    ...
    ldr    r3, [r0, #0]
    tst    r3, #32
    itete    eq
    moveq    r3, #5120
    movne    r3, #5120
    moveq    r2, #8192
    movne    r2, #8192
    itete    eq
    movteq    r3, 16385
    movtne    r3, 16385
    streq    r2, [r3, #4]
    strne    r2, [r3, #0]
    ldr    r3, [r0, #0]
    tst    r3, #64
    itete    eq
    moveq    r3, #5120
    movne    r3, #5120
    moveq    r2, #8192
    movne    r2, #8192
    itete    eq
    movteq    r3, 16385
    movtne    r3, 16385
    streq    r2, [r3, #4]
    strne    r2, [r3, #0]
    ...

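Note how each moveq/movne pair loads the same value on both paths: if equal, load 5120; otherwise, load 5120. The same goes for the movteq/movtne pairs, so the conditional execution buys nothing there. Also note that the device address is rebuilt in r3 on every iteration instead of just once before the (unrolled) loop.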
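For anyone decoding the immediates: mov puts 5120 in the low half of r3, and movt then writes 16385 into the top half, leaving the low half untouched. A minimal sketch of that arithmetic follows. The helper name combine_halves is made up for illustration and is not part of my test case.

```c
#include <stdint.h>

/* Hypothetical helper: rebuild a 32-bit constant the way a mov/movt pair
   does. mov supplies the low 16 bits, movt overwrites the high 16 bits. */
static uint32_t combine_halves(uint16_t low, uint16_t high)
{
    return ((uint32_t)high << 16) | low;
}

/* Device address as seen in the listing: mov r3, #5120 ; movt r3, 16385 */
static uint32_t dev_address_from_listing(void)
{
    return combine_halves(5120, 16385);
}
```

With those inputs the result is 0x40011400, the DEV address. The value in r2, 8192, is simply 1U << 13.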

It's also pointlessly reloading the value of *d ("[r0]") into r3 for every bit, which is a regression from 4.4.3 and the root cause of my primary performance regression.
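To show the shape I would expect, here is a hand-hoisted sketch that reads *d once and keeps the device base in one pointer. It is only an illustration. The name write_data_hoisted and the dev parameter are my own additions (the parameter replaces the fixed DEV address so the function can run on a host). It is also not strictly equivalent to the original: if *d could change during the loop, for example because d aliases a device register, reading it once gives different behaviour.

```c
struct dev_t {
    volatile unsigned R0;
    volatile unsigned R1;
};

/* Hypothetical variant of write_data(): the device pointer is passed in
   instead of using the fixed DEV address. */
void write_data_hoisted(struct dev_t *dev, const unsigned *d)
{
    const unsigned data = *d;       /* read *d once, before the loop */
    const unsigned bit = 1U << 13;  /* value stored on every iteration */
    unsigned i, mask;

    for (i = 0, mask = 1; i < 8; i++, mask <<= 1) {
        if (mask & data)
            dev->R0 = bit;          /* bit set in the data: write R0 */
        else
            dev->R1 = bit;          /* bit clear: write R1 */
    }
}
```

On the target this would be called as write_data_hoisted(DEV, d). The eight volatile stores still happen in order, but the load and the address setup are written only once in the source.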

Should I file zero, one or two bugs about this?

/Tobias

