> -----Original Message-----
> From: gcc-help-owner@xxxxxxxxxxx [mailto:gcc-help-owner@xxxxxxxxxxx] On
> Behalf Of Andrew Haley
> Sent: Wednesday, June 16, 2010 12:52
> To: gcc-help@xxxxxxxxxxx
> Subject: Re: Inline asm for ARM
>
> On 06/16/2010 05:40 PM, Pavel Pavlov wrote:
> >> -----Original Message-----
> >> From: Andrew Haley [mailto:aph@xxxxxxxxxx]
> >> On 06/16/2010 05:11 PM, Pavel Pavlov wrote:
> >>>> -----Original Message-----
> >>>> On 06/16/2010 01:15 PM, Andrew Haley wrote:
> >>>>> On 06/16/2010 11:23 AM, Pavel Pavlov wrote:
> >>> ...
> >>>> inline uint64_t smlalbb(uint64_t acc, unsigned int lo, unsigned int hi) {
> >>>>   union
> >>>>   {
> >>>>     uint64_t ll;
> >>>>     struct
> >>>>     {
> >>>>       unsigned int l;
> >>>>       unsigned int h;
> >>>>     } s;
> >>>>   } retval;
> >>>>
> >>>>   retval.ll = acc;
> >>>>
> >>>>   __asm__("smlalbb %0, %1, %2, %3"
> >>>>           : "+r"(retval.s.l), "+r"(retval.s.h)
> >>>>           : "r"(lo), "r"(hi));
> >>>>
> >>>>   return retval.ll;
> >>>> }
> >>>>
> >>>
> >>> [Pavel Pavlov]
> >>> Later on I found out that I had to use the +r constraint, but then,
> >>> when I use that function, for example like this:
> >>> int64_t rsmlalbb64(int64_t i, int x, int y) {
> >>>   return smlalbb64(i, x, y);
> >>> }
> >>>
> >>> Gcc generates this asm:
> >>> <rsmlalbb64>:
> >>>   push {r4, r5}
> >>>   mov r4, r0
> >>>   mov ip, r1
> >>>   smlalbb r4, ip, r2, r3
> >>>   mov r5, ip
> >>>   mov r0, r4
> >>>   mov r1, ip
> >>>   pop {r4, r5}
> >>>   bx lr
> >>>
> >>> It's bizarre what gcc is doing in that function. I understand if it
> >>> can't optimize and correctly use r0 and r1 directly, but from that
> >>> listing it looks as if gcc got drunk and decided to touch r5 for
> >>> absolutely no reason!
> >>>
> >>> The expected output should have been:
> >>> <rsmlalbb64>:
> >>>   smlalbb r0, r1, r2, r3
> >>>   bx lr
> >>>
> >>> I'm using cegcc 4.1.0 and I compile with
> >>> arm-mingw32ce-g++ -O3 -mcpu=arm1136j-s -c ARM_TEST.cpp -o ARM_TEST_GCC.obj
> >>>
> >>> Is there a way to access individual parts of that 64-bit input
> >>> integer, or is there a way to specify that two 32-bit integers
> >>> should be treated as the Hi:Lo parts of a 64-bit variable? It's
> >>> commonly done with a temporary, but the result is that gcc
> >>> generates too much junk.
> >>
> >> Why don't you just use the function I sent above? It generates
> >>
> >> smlalbb:
> >>   smlalbb r0, r1, r2, r3
> >>   mov pc, lr
> >>
> >> smlalXX64:
> >>   smlalbb r0, r1, r2, r3
> >>   smlalbt r0, r1, r2, r3
> >>   smlaltb r0, r1, r2, r3
> >>   smlaltt r0, r1, r2, r3
> >>   mov pc, lr
> >>
> >
> > [Pavel Pavlov]
> > What's your gcc -v? The output I posted comes from your function.
>
> 4.3.0
>
> Perhaps your compiler options were wrong? Dunno.
>
> Andrew.
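
On the question above about reaching the two halves of a 64-bit operand from inline asm: the following is a minimal sketch, not taken from this thread. It assumes a GCC ARM target, where the %Q0 and %R0 operand modifiers print the low and high registers of a 64-bit operand, so no union or temporary is needed. The names sext16, smlalbb64_model and smlalbb64_sketch are hypothetical. The portable fallback only models what SMLALBB computes (signed bottom halfword of x times signed bottom halfword of y, added to the 64-bit accumulator) so that the code also builds on non-ARM hosts. Whether cegcc 4.1.0 emits the two-instruction sequence for it has not been checked, and the __ARM_FEATURE_DSP guard assumes a compiler that defines that macro; older compilers will silently take the fallback path.

```c
#include <stdint.h>

/* Sign-extend the bottom 16 bits of v (hypothetical helper). */
static inline int32_t sext16(int32_t v)
{
    int32_t lo = v & 0xFFFF;
    return (lo & 0x8000) ? lo - 0x10000 : lo;
}

/* Portable model of SMLALBB: acc += (int16)x * (int16)y.
   The addition is done in uint64_t so it wraps instead of overflowing. */
static inline int64_t smlalbb64_model(int64_t acc, int32_t x, int32_t y)
{
    int32_t prod = sext16(x) * sext16(y);   /* |prod| <= 2^30, fits in 32 bits */
    return (int64_t)((uint64_t)acc + (uint64_t)(int64_t)prod);
}

#if defined(__GNUC__) && defined(__arm__) && defined(__ARM_FEATURE_DSP)
/* Assumption: GCC ARM back end. %Q0 is the low word register and %R0 the
   high word register of the 64-bit read/write operand 0. */
static inline int64_t smlalbb64_sketch(int64_t acc, int32_t x, int32_t y)
{
    __asm__("smlalbb %Q0, %R0, %1, %2"
            : "+r"(acc)
            : "r"(x), "r"(y));
    return acc;
}
#else
/* Fallback for hosts without the instruction: use the portable model. */
static inline int64_t smlalbb64_sketch(int64_t acc, int32_t x, int32_t y)
{
    return smlalbb64_model(acc, x, y);
}
#endif
```

A wrapper such as rsmlalbb64 above could then call smlalbb64_sketch(i, x, y) directly; comparing the generated code against the union version with the same compiler and options would show whether the extra moves disappear.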