On 06/16/2010 05:11 PM, Pavel Pavlov wrote:
>> -----Original Message-----
>> On 06/16/2010 01:15 PM, Andrew Haley wrote:
>>> On 06/16/2010 11:23 AM, Pavel Pavlov wrote:
> ...
>> inline uint64_t smlalbb(uint64_t acc, unsigned int lo, unsigned int hi) {
>>   union
>>   {
>>     uint64_t ll;
>>     struct
>>     {
>>       unsigned int l;
>>       unsigned int h;
>>     } s;
>>   } retval;
>>
>>   retval.ll = acc;
>>
>>   __asm__("smlalbb %0, %1, %2, %3"
>>           : "+r"(retval.s.l), "+r"(retval.s.h)
>>           : "r"(lo), "r"(hi));
>>
>>   return retval.ll;
>> }
>>
>
> [Pavel Pavlov]
> Later on I found out that I had to use the +r constraint, but then,
> when I use that function, for example like this:
> int64_t rsmlalbb64(int64_t i, int x, int y)
> {
>     return smlalbb64(i, x, y);
> }
>
> Gcc generates this asm:
> <rsmlalbb64>:
>     push    {r4, r5}
>     mov     r4, r0
>     mov     ip, r1
>     smlalbb r4, ip, r2, r3
>     mov     r5, ip
>     mov     r0, r4
>     mov     r1, ip
>     pop     {r4, r5}
>     bx      lr
>
> It's bizarre what gcc is doing in that function. I understand if it
> can't optimize and correctly use r0 and r1 directly, but from that
> listing it looks as if gcc got drunk and decided to touch r5 for
> absolutely no reason!
>
> The expected output should have been like this:
> <rsmlalbb64>:
>     smlalbb r0, r1, r2, r3
>     bx      lr
>
> I'm using cegcc 4.1.0 and I compile with
> arm-mingw32ce-g++ -O3 -mcpu=arm1136j-s -c ARM_TEST.cpp -o ARM_TEST_GCC.obj
>
> Is there a way to access individual parts of that 64-bit input
> integer, or is there a way to specify that two 32-bit integers
> should be treated as the Hi:Lo parts of a 64-bit variable? It's
> commonly done with a temporary, but the result is that gcc generates
> too much junk.

Why don't you just use the function I sent above?  It generates

smlalbb:
        smlalbb r0, r1, r2, r3
        mov     pc, lr

smlalXX64:
        smlalbb r0, r1, r2, r3
        smlalbt r0, r1, r2, r3
        smlaltb r0, r1, r2, r3
        smlaltt r0, r1, r2, r3
        mov     pc, lr

Andrew.
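For readers following the Hi:Lo question, here is a minimal sketch of another way to hand the two halves of a 64-bit accumulator to the instruction: split it with shifts instead of a union. Nobody in the thread posted this, and the names smlalbb_sketch and bottom_half_signed are made up for illustration. The asm path assumes an ARM target where the compiler defines __ARM_FEATURE_DSP. On other targets a plain C fallback models what SMLALBB computes, namely the 64-bit accumulator plus the signed product of the bottom 16 bits of each operand. What code a particular GCC version generates for it is not something this sketch can promise.

```c
#include <stdint.h>

/* Sign-extend the bottom 16 bits of a 32-bit word without relying on
   implementation-defined narrowing casts. (Hypothetical helper.) */
static int32_t bottom_half_signed(uint32_t w)
{
    int32_t v = (int32_t)(w & 0xFFFFu);
    return (v & 0x8000) ? v - 0x10000 : v;
}

/* Hypothetical name, not from the thread.
   Returns acc + (signed low half of a) * (signed low half of b). */
static uint64_t smlalbb_sketch(uint64_t acc, uint32_t a, uint32_t b)
{
#if defined(__arm__) && defined(__ARM_FEATURE_DSP)
    /* Split the accumulator into RdLo and RdHi with shifts, so the
       result does not depend on how a union is laid out in memory. */
    uint32_t lo = (uint32_t)acc;
    uint32_t hi = (uint32_t)(acc >> 32);

    /* Both halves are read and written, hence "+r". */
    __asm__("smlalbb %0, %1, %2, %3"
            : "+r"(lo), "+r"(hi)
            : "r"(a), "r"(b));

    return ((uint64_t)hi << 32) | lo;
#else
    /* Portable model of the same arithmetic for non-ARM builds.
       The unsigned addition wraps modulo 2^64, as the instruction does. */
    int64_t prod = (int64_t)bottom_half_signed(a) * bottom_half_signed(b);
    return acc + (uint64_t)prod;
#endif
}
```

The top 16 bits of a and b are ignored in both paths, so smlalbb_sketch(5, 0x12340002, 0xABCD0003) adds 2 * 3 to 5. The fallback also makes it possible to check the arithmetic on a desktop machine before looking at the ARM listing.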