On 06/16/2010 06:12 PM, Pavel Pavlov wrote:
>> From: gcc-help-owner@xxxxxxxxxxx [mailto:gcc-help-owner@xxxxxxxxxxx] On
>> Behalf Of Andrew Haley
>>
>>> By the way, the version that takes hi:lo for the first int64 works fine:
>>>
>>> static __inline void smlalbb(int * lo, int * hi, int x, int y) {
>>> #if defined(__CC_ARM)
>>>     __asm { smlalbb *lo, *hi, x, y; }
>>> #elif defined(__GNUC__)
>>>     __asm__ __volatile__("smlalbb %0, %1, %2, %3" : "+r"(*lo), "+r"(*hi)
>>>         : "r"(x), "r"(y));
>>> #endif
>>> }
>>>
>>>
>>> void test_smlalXX(int hi, int lo, int a, int b) {
>>>     smlalbb(&hi, &lo, a, b);
>>>     smlalbt(&hi, &lo, a, b);
>>>     smlaltb(&hi, &lo, a, b);
>>>     smlaltt(&hi, &lo, a, b);
>>> }
>>>
>>> Translates directly into four asm opcodes
>>
>> Mmmm, but the volatile is wrong.  If you need volatile to stop gcc
>> from deleting your asm, you have a mistake somewhere.
>
> I had to add volatile when I had that mess with "=&r" and "0"; now I
> think it might be removed.
> Just tested, and I still need it. The reason I needed it was
> that my test function was a no-op:
> void test_smlalXX(int lo, int hi, int a, int b)
> {
>     smlalbb(&lo, &hi, a, b);
>     smlalbt(&lo, &hi, a, b);
>     smlaltb(&lo, &hi, a, b);
>     smlaltt(&lo, &hi, a, b);
> }
> Gcc correctly guesses that there is no side effect from that
> function if I don't use volatile. So I removed volatile and added
> a return value to that function:
>
> uint64_t test_smlalXX(int lo, int hi, int a, int b)
> {
>     smlalbb(&lo, &hi, a, b);
>     smlalbt(&lo, &hi, a, b);
>     smlaltb(&lo, &hi, a, b);
>     smlaltt(&lo, &hi, a, b);
>
>     T64 retval;
>
>     retval.s.hi = hi;
>     retval.s.lo = lo;
>     return retval.i64;
> }
>
> The output becomes:
> 000000e4 <_Z12test_smlalXXiiii>:
>   e4: e92d0030 push {r4, r5}
>   e8: e1410382 smlalbb r0, r1, r2, r3
>   ec: e14103c2 smlalbt r0, r1, r2, r3
>   f0: e14103a2 smlaltb r0, r1, r2, r3
>   f4: e1a05001 mov r5, r1
>   f8: e14503e2 smlaltt r0, r5, r2, r3
>   fc: e1a04000 mov r4, r0
>  100: e1a01005 mov r1, r5
>  104: e8bd0030 pop {r4, r5}
>  108: e12fff1e bx lr
>
> Basically gcc gets confused about the return variable and generates
> useless gunk at the end for the last function. I tried commenting out
> smlaltt(&lo, &hi, a, b); in test_smlalXX, and gcc still
> generates that same useless code around smlaltb.

I have seen something similar with higher optimization levels, where
some pass messes things up a bit.  Your mov r4, r0 is very weird,
though.  I can't explain that.  -O1 generates perfect code for me,
though.

Andrew.
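
For anyone wanting to reproduce this, here is a minimal sketch of a
self-contained test case for the volatile-free variant discussed above.
Several things in it are assumptions rather than anything shown in the
thread: the T64 union layout (guessed, and only correct on a
little-endian target), the __ARM_FEATURE_DSP guard, the name
test_smlalbb, and the plain-C fallback, which exists only so the file
also builds on non-ARM hosts. The point it illustrates is that the asm
needs no volatile once its "+r" outputs feed the return value, because
gcc then sees the results as live.

```c
#include <stdint.h>

/* Assumed layout: the thread never shows T64, so this is a guess.
   lo-then-hi matches a little-endian target only. */
typedef union {
    uint64_t i64;
    struct { int32_t lo; int32_t hi; } s;
} T64;

/* hi:lo += (bottom half of x) * (bottom half of y), both signed 16-bit. */
static inline void smlalbb(int *lo, int *hi, int x, int y)
{
#if defined(__GNUC__) && defined(__arm__) && defined(__ARM_FEATURE_DSP)
    /* No volatile: "+r" marks *lo and *hi as read and written, so the
       asm is kept for as long as those results are actually used. */
    __asm__("smlalbb %0, %1, %2, %3"
            : "+r"(*lo), "+r"(*hi)
            : "r"(x), "r"(y));
#else
    /* Portable fallback (an addition for illustration, not from the
       thread). Unsigned arithmetic avoids signed-overflow issues. */
    uint64_t acc = ((uint64_t)(uint32_t)*hi << 32) | (uint32_t)*lo;
    acc += (uint64_t)(int64_t)((int16_t)x * (int16_t)y);
    *lo = (int)(uint32_t)acc;
    *hi = (int)(uint32_t)(acc >> 32);
#endif
}

/* Hypothetical single-instruction version of the quoted test function.
   Returning hi:lo is what keeps the asm from being optimized away. */
uint64_t test_smlalbb(int lo, int hi, int a, int b)
{
    T64 retval;

    smlalbb(&lo, &hi, a, b);
    retval.s.hi = hi;
    retval.s.lo = lo;
    return retval.i64;
}
```

Building this for an ARM target at -O1 and disassembling the object
file should be enough to compare against the listing quoted above; no
particular output is claimed here.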