Re: Inline Assembly queries

kernel mailz <kernelmailz@xxxxxxxxxxxxxx> · Tue, 30 Jun 2009 10:57:07 +0530

Hi Scott,
I agree with you, and I more or less understand why it is required.
But buddy, unless you can see a construct make a visible difference in
the generated code, the concept is just a piece of theory.

I am going through all the inline assembly in the kernel code to find
an example that behaves differently with the "memory" clobber.

For instance, take atomic_add and atomic_add_return: atomic_add_return
has the "memory" clobber, while atomic_add omits it.
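
As a rough illustration of that difference, here is a minimal sketch that
uses the GCC/Clang __atomic builtins instead of the kernel's PowerPC inline
asm, so it builds on any target. The names sketch_atomic_add,
sketch_atomic_add_return and sketch_demo are made up for this mail, and the
relaxed vs. sequentially consistent orderings are only an approximation of
"no clobber, no barrier" vs. "memory clobber plus barriers".

```c
#include <assert.h>

/* Hypothetical stand-in for atomic_add: the update is atomic, but it
 * imposes no ordering on the loads and stores around it. */
static inline void sketch_atomic_add(int i, int *v)
{
    __atomic_fetch_add(v, i, __ATOMIC_RELAXED);
}

/* Hypothetical stand-in for atomic_add_return: returns the new value and
 * also orders the surrounding memory accesses. */
static inline int sketch_atomic_add_return(int i, int *v)
{
    return __atomic_add_fetch(v, i, __ATOMIC_SEQ_CST);
}

/* Single-threaded demo: both helpers update the same counter. */
static int sketch_demo(void)
{
    int counter = 0;
    int now;

    sketch_atomic_add(5, &counter);                /* counter becomes 5 */
    now = sketch_atomic_add_return(3, &counter);   /* counter becomes 8 */
    assert(now == counter);
    return counter;
}
```

In a single thread both variants give the same result; the ordering only
matters to what other CPUs can observe and to what the compiler may move
across the call.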

-TZ

On Tue, Jun 30, 2009 at 12:57 AM, Scott Wood<scottwood@xxxxxxxxxxxxx> wrote:
> On Mon, Jun 29, 2009 at 09:19:57PM +0530, kernel mailz wrote:
>> I tried a small example
>>
>> int *p = (int *)0x1000;
>> int a = *p;
>> asm("sync":::"memory");
>> a = *p;
>>
>> and
>>
>> volatile int *p = (volatile int *)0x1000;
>> int a = *p;
>> asm("sync");
>> a = *p;
>>
>> Got the same assembly.
>> Which is right.
>>
>> So does this mean that if volatile is used properly, there is no need
>> for "memory"?
>
> No.  As I understand it, volatile concerns deletion of the asm statement
> (if no outputs are used) and reordering with respect to other asm
> statements (not sure whether GCC will actually do this), while the memory
> clobber concerns optimization of non-asm loads/stores around the asm
> statement.
>
>> static inline unsigned long
>> __xchg_u32(volatile void *p, unsigned long val)
>> {
>>        unsigned long prev;
>>
>>        __asm__ __volatile__(
>> "1:     lwarx   %0,0,%2 \n"
>> "       stwcx.  %3,0,%2 \n"
>> "       bne-    1b"
>>        : "=&r" (prev), "+m" (*(volatile unsigned int *)p)
>>        : "r" (p), "r" (val)
>>        : "cc" /* "memory" clobber removed for this test */ );
>>
>>        return prev;
>> }
>> #define ADDR 0x1000
>> int main()
>> {
>>        __xchg_u32((void*)ADDR, 0x2000);
>>        __xchg_u32((void*)ADDR, 0x3000);
>>
>>        return 0;
>>
>> }
>>
>> Got the same asm when compiled with -O1, with or without the "memory" clobber.
>
> This isn't a good test case, because there's nothing other than inline
> asm going on in that function for GCC to optimize.  Plus, it's generally
> not a good idea, when talking about what the compiler is or isn't allowed
> to do, to point to a single test case (or even several) and say that it
> isn't required because you don't notice a difference.  Even if there were
> no code at all with which it made a difference with GCC version X, it
> could make a difference with GCC version X+1.
>
> -Scott
>
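
A test case where the clobber has something to act on needs ordinary C loads
or stores around the asm. The following is a minimal sketch of that idea, not
a guarantee of what any particular GCC version emits. It uses an empty asm
body in place of "sync" so that it builds on any target; shared_word and the
function names are hypothetical.

```c
#include <assert.h>

/* Hypothetical non-volatile global that the compiler may cache in a register. */
int shared_word = 1;

/* No "memory" clobber: the compiler may assume shared_word is unchanged
 * across the asm and is allowed to reuse the first load. */
int read_twice_no_clobber(void)
{
    int a = shared_word;
    __asm__ __volatile__("");
    int b = shared_word;
    return a + b;
}

/* With "memory": the compiler must assume memory was modified by the asm,
 * so it has to reload shared_word after it. */
int read_twice_with_clobber(void)
{
    int a = shared_word;
    __asm__ __volatile__("" ::: "memory");
    int b = shared_word;
    return a + b;
}

/* Single-threaded sanity check: the results match; any difference is in
 * the generated code, not in the value returned. */
void sketch_self_check(void)
{
    assert(read_twice_no_clobber() == read_twice_with_clobber());
}
```

Compiling this with something like gcc -O2 -S and comparing the two functions
is where a difference may show up: the first may be reduced to a single load,
while the second should keep both loads.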