Question about asm memory constraints


I'm getting some unexpected behavior from gcc. I'm not prepared to call it a bug. I just want to understand what I'm seeing.

In my code (included below), I:

1) Create a 100-byte buffer, and set buff[5] to 'A'.
2) Call __stosb, which uses inline asm to overwrite all of buff with 'B'.
3) Use a memory constraint in __stosb to flush buff instead of using the "memory" clobber. The size of the memory block used in the constraint is controlled by a #define.

With this, I have a simple test to see if the memory constraint is correctly causing the buffer to get flushed by the asm call. If it is flushing the buffer, printing buff[5] after __stosb will print 'B'. If it is not flushing, it will print 'A'. The results were a bit surprising.

- Since buff[5] is the 6th byte in the buffer, using memory constraint sizes of 1, 2 & 4 (not surprisingly) all print 'A'.
- Sizes of 8 and 16 print 'B'. This is also the expected result, since I am now flushing enough of buff to include buff[5].
- The surprise comes from using a size of 3 or 5. These also print 'B'. WTF? Why would 4 not flush, and 3 flush?

I believe the answer comes from reading the RTL. The difference between sizes of 3 and 16 comes here:

   (set (mem/c:BLK (plus:DI (reg/f:DI 7 sp)
               (const_int 32 [0x20])) [ MEM[(struct _reallybigstruct *)&buff]+0 S3 A128])
        (asm_operands/v:BLK ("rep stos{b|b}") ("=m") 2 [

   (set (mem/c:TI (plus:DI (reg/f:DI 7 sp)
               (const_int 32 [0x20])) [ MEM[(struct _reallybigstruct *)&buff]+0 S16 A128])
        (asm_operands/v:TI ("rep stos{b|b}") ("=m") 2 [

While I don't really read RTL, TI clearly refers to TImode. Apparently, when the size exactly matches a machine mode, an asm memory reference flushes just that many bytes. When it doesn't, gcc seems to fall back to BLK mode.

Which brings us to the essential question here:
Does using BLK mode here *just* flush all of buff? Or does it perform a full asm "memory" clobber and flush everything?

I've been experimenting, and (unfortunately) it looks like it does the full clobber (see second program below), but I could use some confirmation. I could also use an opinion on whether that is the intended behavior, or whether something is just going wrong.

Being able to use memory constraints could be a nice performance win over forcing a full memory clobber.

Thanks,
dw

------------------------------------------------------------
Here's the code (compiled with gcc version 4.9.0 x86_64-win32-seh-rev2, using -O2 -fdump-final-insns):

// Code that shows weirdness with memory constraints
#include <stdio.h>

#define MYSIZE 3

inline void
__stosb(unsigned char *Dest, unsigned char Data, size_t Count)
{
   struct _reallybigstruct { char x[MYSIZE]; }
      *p = (struct _reallybigstruct *)Dest;

   __asm__ __volatile__ ("rep stos{b|b}"
      : "+D" (Dest), "+c" (Count), "=m" (*p)
      : [Data] "a" (Data)
      //: "memory"
   );
}

int main()
{
   unsigned char buff[100];
   buff[5] = 'A';

   __stosb(buff, 'B', sizeof(buff));
   printf("%c\n", buff[5]);
}

-------------------------------------
Here is my attempt to prove that a full clobber is being performed. Compile this code (as above), and look at the -S output. With a size of 8, the store to buff2[5] appears after the "rep stosb". Change the size to 3, and the store moves before it. If 3 is really causing a full memory clobber and 8 is not, this is the behavior I would expect. While not exactly conclusive, it sure looks like a full clobber.

// Code that tries to prove a full "memory" clobber is being performed.
#include <stdio.h>

#define MYSIZE 3

inline void
__stosb(unsigned char *Dest, unsigned char Data, size_t Count)
{
   struct _reallybigstruct { char x[MYSIZE]; }
      *p = (struct _reallybigstruct *)Dest;

   __asm__ __volatile__ ("rep stos{b|b}"
      : "+D" (Dest), "+c" (Count), "=m" (*p)
      : [Data] "a" (Data)
      //: "memory"
   );
}

int main()
{
   unsigned char buff1[100], buff2[100];
   buff1[5] = 'A';
   buff2[5] = 'M';
   asm("#" : : "r" (buff2));

   __stosb(buff1, 'B', sizeof(buff1));
   printf("%c %c\n", buff1[5], buff2[5]);
}
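
-------------------------------------
For comparison, here is a minimal sketch of a different way to describe the written region: instead of a fixed-size struct, the memory operand is cast to an array whose length is the runtime Count, an idiom that GCC's Extended Asm documentation describes for memory operands. The names fill_bytes and fill_check are hypothetical and not part of the programs above, and the memset fallback is only an assumption added so the sketch builds on non-x86 targets. It does not answer whether BLK mode performs a full clobber; it only shows an operand whose size covers the whole buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not from the programs above): fill Count bytes
   at Dest with Data. Assumes Dest is valid and Count > 0, because the
   array type below has Count elements. */
static inline void
fill_bytes(unsigned char *Dest, unsigned char Data, size_t Count)
{
   assert(Dest != NULL && Count > 0);

#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
   /* Describe the output as "an array of Count bytes starting at Dest".
      The array length is evaluated here, before the asm changes Count. */
   unsigned char (*region)[Count] = (unsigned char (*)[Count])Dest;

   __asm__ __volatile__ ("rep stosb"
      : "+D" (Dest), "+c" (Count), "=m" (*region)
      : "a" (Data)
   );
#else
   /* Assumption: plain memset as a stand-in on other targets. */
   memset(Dest, Data, Count);
#endif
}

/* Same test as the first program: returns buff[5] after the fill.
   This should be 'B' if the memory operand covers the whole buffer. */
static int
fill_check(void)
{
   unsigned char buff[100];
   buff[5] = 'A';

   fill_bytes(buff, 'B', sizeof(buff));
   return buff[5];
}
```

Calling fill_check() from main() and printing the result as a character repeats the buff[5] experiment with an operand sized to the full 100 bytes rather than MYSIZE.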




