On Thu, Sep 24, 2020 at 08:24:46AM +0000, David Laight wrote:
> static inline void movdir64b(void *dst, const void *src)
> {
> 	/*
> 	 * 64 bytes from dst are marked as modified for completeness.
> 	 * Since the writes bypass the cache later reads may return
> 	 * old data anyway.
> 	 */
> 	/* MOVDIR64B [rdx], rax */
> 	asm volatile (".byte 0x66, 0x0f, 0x38, 0xf8, 0x02"
> 		: "=m" ((struct { char _[64];} *)dst),
> 		: "m" ((struct { char _[64];} *)src), "d" (src), "a" (dst));

Now since you're so generous with your advice on random threads, please
explain what you're advising here?

The destination operand - in this case in %rax - is "destination memory
address specified as offset to ES segment in the register operand."

So what is the difference between:

	...(void *dst, ... )

	volatile struct { char _[64]; } *__dst = dst;

	...
		: "=m" (__dst)
		: "a" (__dst)

and

	...(void *dst, ... )

	...
		: "=m" ((struct { char _[64];} *)dst)
		: "a" (__dst)

and why?

Point me to the gcc documentation where this is explained.

To cut to the chase, I don't think you need to do that, otherwise clwb()
would be broken too but perhaps you know something I don't.

Looking at clwb(), I believe the proper specification should be:

	volatile struct { char _[64]; } *__dst = dst;

	...
		: "+m" (__dst)
		: "a" (__dst)

And if anything, the source specification should be something like that:

	volatile struct { char x[64]; } *__src = src;

	...
		"d" (__src)

because this tells gcc that the source operand would read 64 bytes
through the pointer in the %rdx reg.

So this ends up close to what you're saying but it is using local
variables to make the asm actually readable.

Lemme add Micha to Cc for sanity-checking:

Micha, the instruction is:

MOVDIR64B %(rdx), rax

"Move 64-bytes as direct-store with guaranteed 64-byte write atomicity
from the source memory operand address to destination memory address
specified as offset to ES segment in the register operand."

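For concreteness, the local-variable form discussed above can be sketched
as something that builds and runs in userspace. This is only an
illustration under stated assumptions, not a proposed patch: the names
copy64() and copy64_selftest() are hypothetical, "rep movsb" stands in for
the MOVDIR64B opcode bytes (so there is no direct store and no 64-byte
atomicity), and the "m" operands are dereferenced so that they describe
the 64-byte objects rather than the pointer variables. Whether gcc needs
that last part here is exactly the open question.

```c
#include <assert.h>
#include <string.h>

/*
 * copy64() is a hypothetical, userspace stand-in for movdir64b().
 * "rep movsb" replaces the MOVDIR64B opcode bytes so that this runs
 * on any x86-64 CPU; it is NOT a direct store and NOT 64-byte atomic.
 */
static inline void copy64(void *dst, const void *src)
{
	/* Local, typed pointers: each one points at a 64-byte object. */
	const struct { char _[64]; } *__src = src;
	struct { char _[64]; } *__dst = dst;

#if defined(__x86_64__) && defined(__GNUC__)
	unsigned long cnt = sizeof(*__dst);

	/*
	 * Assumption: the "m" operands are dereferenced, so they name the
	 * 64 bytes behind the pointers and not the pointer variables.
	 * They are never used in the template; they only describe which
	 * memory is written and read. The addresses go in via registers.
	 */
	__asm__ volatile("rep movsb"
			 : "=m" (*__dst), "+D" (dst), "+S" (src), "+c" (cnt)
			 : "m" (*__src));
#else
	/* Fallback for other targets so the sketch still builds. */
	memcpy(__dst, __src, sizeof(*__dst));
#endif
}

/* Usage: copy one 64-byte block and check that it arrived. */
void copy64_selftest(void)
{
	char from[64], to[64];
	int i;

	for (i = 0; i < 64; i++) {
		from[i] = (char)i;
		to[i] = 0;
	}

	copy64(to, from);
	assert(memcmp(to, from, sizeof(to)) == 0);
}
```

The typed local pointers keep the operand list short, which is the
readability argument made above; the casts-in-the-constraint variant would
carry the same size information but inline.
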
Do I need to tell gcc that both operands are referencing 64 bytes,
source operand is a memory reference, destination operand is an address
specified in a register?

What we have currently is:

	volatile struct { char _[64]; } *dst = __dst;

	/* MOVDIR64B [rdx], rax */
	asm volatile(".byte 0x66, 0x0f, 0x38, 0xf8, 0x02"
		     : "=m" (dst)
		     : "d" (from), "a" (dst));

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette