Re: Inline asm for ARM

On 06/16/2010 05:40 PM, Pavel Pavlov wrote:
>> -----Original Message-----
>> From: Andrew Haley [mailto:aph@xxxxxxxxxx]
>> On 06/16/2010 05:11 PM, Pavel Pavlov wrote:
>>>> -----Original Message-----
>>>> On 06/16/2010 01:15 PM, Andrew Haley wrote:
>>>>> On 06/16/2010 11:23 AM, Pavel Pavlov wrote:
>>> ...
>>>> inline uint64_t smlalbb(uint64_t acc, unsigned int lo, unsigned int hi) {
>>>>   union
>>>>   {
>>>>     uint64_t ll;
>>>>     struct
>>>>     {
>>>>       unsigned int l;
>>>>       unsigned int h;
>>>>     } s;
>>>>   } retval;
>>>>
>>>>   retval.ll = acc;
>>>>
>>>>   __asm__("smlalbb %0, %1, %2, %3"
>>>> 	  : "+r"(retval.s.l), "+r"(retval.s.h)
>>>> 	  : "r"(lo), "r"(hi));
>>>>
>>>>   return retval.ll;
>>>> }
>>>>
>>>
>>> [Pavel Pavlov]
>>> Later on I found out that I had to use the +r constraint, but then, when I
>>> use that function, for example like this:
>>> int64_t rsmlalbb64(int64_t i, int x, int y) {
>>> 	return smlalbb64(i, x, y);
>>> }
>>>
>>> Gcc generates this asm:
>>> <rsmlalbb64>:
>>> push	{r4, r5}
>>> mov	r4, r0
>>> mov	ip, r1
>>> smlalbb	r4, ip, r2, r3
>>> mov	r5, ip
>>> mov	r0, r4
>>> mov	r1, ip
>>> pop	{r4, r5}
>>> bx	lr
>>>
>>> It's bizarre what gcc is doing in that function, I understand if it
>>> can't optimize and correctly use r0 and r1 directly, but from that
>>> listing it looks as if gcc got drunk and decided to touch r5 for
>>> absolutely no reason!
>>>
>>> The expected output should have looked like this:
>>> <rsmlalbb64>:
>>> smlalbb	r0, r1, r2, r3
>>> bx	lr
>>>
>>> I'm using cegcc 4.1.0 and I compile with
>>> arm-mingw32ce-g++ -O3 -mcpu=arm1136j-s -c ARM_TEST.cpp -o
>>> arm-mingw32ce-g++ ARM_TEST_GCC.obj
>>>
>>> Is there a way to access the individual halves of that 64-bit input
>>> integer, or is there a way to specify that two 32-bit integers should be
>>> treated as the Hi:Lo parts of a 64-bit variable? It's commonly done with a
>>> temporary, but the result is that gcc generates too much junk.
>>
>> Why don't you just use the function I sent above?  It generates
>>
>> smlalbb:
>> 	smlalbb r0, r1, r2, r3
>> 	mov	pc, lr
>>
>> smlalXX64:
>> 	smlalbb r0, r1, r2, r3
>> 	smlalbt r0, r1, r2, r3
>> 	smlaltb r0, r1, r2, r3
>> 	smlaltt r0, r1, r2, r3
>> 	mov	pc, lr
>>
> 
> [Pavel Pavlov] 
> What's your gcc -v? The output I posted comes from your function.

4.3.0

Perhaps your compiler options were wrong?  Dunno.

Andrew.
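
For anyone comparing compiler versions on this, here is a minimal sketch of the union approach from the thread next to a plain-C model of what SMLALBB computes (acc += signed low halfword of x times signed low halfword of y). The names smlalbb_ref and smlalbb64_sketch are hypothetical, the struct layout assumes a little-endian target, and the __ARM_FEATURE_DSP guard is an assumption about how one might select the asm path; it is not what either poster compiled.

```c
#include <stdint.h>

/* Hypothetical reference model of SMLALBB: multiply the signed bottom
 * halfwords of x and y and add the product to the 64-bit accumulator.
 * The addition is done in unsigned arithmetic so it wraps like the
 * hardware instead of overflowing a signed type. */
static inline int64_t smlalbb_ref(int64_t acc, uint32_t x, uint32_t y)
{
    int32_t prod = (int32_t)(int16_t)(x & 0xFFFFu) *
                   (int32_t)(int16_t)(y & 0xFFFFu);
    return (int64_t)((uint64_t)acc + (uint64_t)(int64_t)prod);
}

/* Hypothetical wrapper: uses the union trick from the thread on ARM
 * targets that advertise the DSP multiplies, else falls back to the model. */
static inline int64_t smlalbb64_sketch(int64_t acc, uint32_t x, uint32_t y)
{
#if defined(__arm__) && defined(__ARM_FEATURE_DSP)
    union {
        int64_t ll;
        struct { uint32_t l, h; } s;   /* assumes little-endian: l is the low word */
    } v;
    v.ll = acc;
    __asm__("smlalbb %0, %1, %2, %3"
            : "+r"(v.s.l), "+r"(v.s.h)  /* RdLo, RdHi are read and written */
            : "r"(x), "r"(y));
    return v.ll;
#else
    return smlalbb_ref(acc, x, y);
#endif
}
```

Checking the asm path against smlalbb_ref for a few inputs is a cheap way to confirm the constraints are right before looking at the generated code. GCC's ARM backend also accepts the %Q and %R operand modifiers for the low and high words of a 64-bit operand, which avoids the union altogether; that variant is not shown here.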

