Re: [PATCH 09/22] KVM: selftests: Verify KVM correctly handles mprotect(PROT_READ)

Sean Christopherson <seanjc@xxxxxxxxxx> · Mon, 9 Sep 2024 08:49:41 -0700

On Fri, Sep 06, 2024, James Houghton wrote:
> On Fri, Sep 6, 2024 at 5:53 PM Sean Christopherson <seanjc@xxxxxxxxxx> wrote:
> >  #ifdef __x86_64__
> > -                       asm volatile(".byte 0xc6,0x40,0x0,0x0" :: "a" (gpa) : "memory"); /* MOV RAX, [RAX] */
> > +                       asm volatile(".byte 0x48,0x89,0x00" :: "a"(gpa) : "memory"); /* mov %rax, (%rax) */
> 
> FWIW I much prefer the trailing comment you have ended up with vs. the
> one you had before. (To me, the older one _seems_ like it's Intel
> syntax, in which case the comment says it's a load..? The comment you
> have now is, to me, obviously indicating a store. Though... perhaps
> "movq"?)

TL;DR: "movq" is arguably a worse mnemonic than simply "mov" because MOV *and*
        MOVQ are absurdly overloaded mnemonics, and because x86-64 is wonky.

Heh, "movq" is technically a different instruction (MMX/SSE instruction).  For
ambiguous mnemonics, the assembler infers the exact instructions from the operands.
When a register is the source or destination, appending the size to a vanilla MOV
is 100% optional, as the width of the register communicates the desired size
without any ambiguity.

When there is no register operand, e.g. storing an immediate to memory, the size
becomes necessary, sort of.  The assembler will still happily accept an inferred
size, but the size is simply the default operand size for the current mode.

E.g.

  mov $0xffff, (%0)

will generate a 4-byte MOV

  c7 00 ff ff 00 00

so if you actually wanted a 2-byte MOV, the mnemonic needs to be:

  movw $0xffff, (%0)
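
To make the width difference observable, here is a minimal sketch that performs
both stores into a poisoned buffer.  The helper names are hypothetical (not from
the patch), the suffixes are spelled out explicitly so that the snippet doesn't
depend on how a given assembler treats the unsuffixed form, and the non-x86
branch is only a stand-in so that the sketch builds elsewhere.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* 4-byte store, i.e. the width the unsuffixed example above ends up with. */
static void store_ffff_long(uint8_t *p)
{
	assert(p != NULL);
#if defined(__x86_64__)
	__asm__ volatile("movl $0xffff, (%0)" :: "r"(p) : "memory");
#else
	/* Stand-in for non-x86 builds: same little-endian bytes in memory. */
	p[0] = 0xff; p[1] = 0xff; p[2] = 0x00; p[3] = 0x00;
#endif
}

/* 2-byte store, which needs the explicit "w" suffix. */
static void store_ffff_word(uint8_t *p)
{
	assert(p != NULL);
#if defined(__x86_64__)
	__asm__ volatile("movw $0xffff, (%0)" :: "r"(p) : "memory");
#else
	/* Stand-in for non-x86 builds. */
	p[0] = 0xff; p[1] = 0xff;
#endif
}
```

With a buffer pre-filled with 0xaa, the first helper overwrites bytes 0-3 and
the second only bytes 0-1.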

There is still value in specifying an explicit operand size in assembly, as it
disambiguates the size for human readers, and also generates an error if the
operands mismatch.

E.g.

  movw $0xffff, %%eax

will fail with

  incorrect register `%eax' used with `w' suffix

The really fun one is if you want to load a 64-bit gpr with an immediate.  All
else being equal, the assembler will generally optimize for code size, and so
if the desired value can be generated by sign-extension, the assembler will opt
for opcode 0xc7 (sign-extended imm32) instead of 0xb8 (full imm64).

E.g.

  mov $0xffffffffffffffff, %%rax

generates

  48 c7 c0 ff ff ff ff

whereas, somewhat counter-intuitively, this

  mov $0xffffffff, %%rax

generates the more gnarly

  48 b8 ff ff ff ff 00 00 00 00

But wait, there's more!  If the developer were a wee bit smarter, they could/should
actually write

  mov $0xffffffff, %%eax

to generate

  b8 ff ff ff ff

because in x86-64, writing the lower 32 bits of a 64-bit register architecturally
clears the upper 32 bits.  I mention this because you'll actually see the compiler
take advantage of this behavior.

E.g. if you were to load RAX through an inline asm constraint

  asm volatile(".byte 0xcc" :: "a"(0xffffffff) : "memory");

the generated code will indeed be:

  b8 ff ff ff ff          mov    $0xffffffff,%eax

or if you explicitly load a register with '0'

  31 c0                   xor    %eax,%eax
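
If you want to see the zero-extension for yourself, a rough sketch along these
lines should do it.  The function names are made up for illustration, "%k0" is
the GCC/Clang operand modifier that prints the 32-bit alias of the register, and
the non-x86 branch is merely a stand-in so that the snippet compiles elsewhere.

```c
#include <assert.h>
#include <stdint.h>

/* Write only the low 32 bits of a register that starts out as `initial`. */
static uint64_t write_low_half(uint64_t initial)
{
	uint64_t reg = initial;
#if defined(__x86_64__)
	/* %k0 names the 32-bit alias (e.g. %eax) of whichever GPR holds reg. */
	__asm__ volatile("movl $0xffffffff, %k0" : "+r"(reg));
#else
	reg = 0xffffffffULL;	/* stand-in for non-x86 builds */
#endif
	return reg;
}

/* The stale upper 32 bits are gone after the 32-bit write. */
static void check_upper_bits_cleared(void)
{
	assert(write_low_half(0xdeadbeefcafef00dULL) == 0xffffffffULL);
}
```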

Lastly, because "%0" in 64-bit mode refers to RAX, not EAX, this:

  asm volatile("mov $0xffffffff, %0" :: "a"(gpa) : "memory");

generates

  48 b8 ff ff ff ff 00 00 00 00

i.e. is equivalent to "mov .., %%rax".
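
To tie the three encodings together, here's a small sketch of a helper that
picks the shortest of them for a given value.  To be clear, this is a
hypothetical function written for this mail, not what the assembler does when
you hand it an explicit %rax destination; it just mirrors the choice a careful
developer (or the compiler) would make.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Store the low `nbytes` bytes of `val` in little-endian order. */
static void put_le(uint8_t *dst, uint64_t val, size_t nbytes)
{
	for (size_t i = 0; i < nbytes; i++)
		dst[i] = (uint8_t)(val >> (8 * i));
}

/*
 * Emit the shortest MOV-immediate encoding that leaves `val` in RAX.
 * Returns the number of bytes written to `out`.
 */
static size_t encode_mov_imm_rax(uint64_t val, uint8_t out[10])
{
	assert(out != NULL);

	if (val <= UINT32_MAX) {
		/* b8 imm32: mov $imm32, %eax; the upper 32 bits get cleared. */
		out[0] = 0xb8;
		put_le(&out[1], val, 4);
		return 5;
	}
	if (val >= 0xffffffff80000000ULL) {
		/* 48 c7 c0 imm32: mov $imm32, %rax with sign-extension. */
		out[0] = 0x48;
		out[1] = 0xc7;
		out[2] = 0xc0;
		put_le(&out[3], val, 4);
		return 7;
	}
	/* 48 b8 imm64: the full 10-byte form. */
	out[0] = 0x48;
	out[1] = 0xb8;
	put_le(&out[2], val, 8);
	return 10;
}
```

E.g. 0xffffffffffffffff comes out as the 7-byte 0xc7 form, 0xffffffff as the
5-byte 0xb8 form, and something like 0x1ffffffff needs the full 10 bytes.  The
sketch ignores the XOR trick for zero.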

Jumping back to "movq", it's perfectly fine in this case, but also fully
redundant.  And so I would prefer to document it simply as "mov", because "movq"
would be more appropriate to document something like this:

  asm volatile("movq %0, %%xmm0" :: "a"(gpa) : "memory");

  66 48 0f 6e c0          movq   %rax,%xmm0

LOL, which brings up more quirks/warts with x86-64.  Many instructions in x86,
especially SIMD instructions, have mandatory "prefixes" in order to squeeze more
instructions out of the available opcodes.  E.g. the operand size prefix, 0x66,
is reserved for MMX instructions, which allows the architecture to usurp the
reserved combination for XMM instructions.  Table 9-3, "Effect of Prefixes on
MMX Instructions", says this

  Operand Size (66H)    Reserved and may result in unpredictable behavior.

and specifically says "unpredictable behavior" instead of #UD, because prefixing
most MMX instructions with 0x66 "promotes" the instruction to operate on XMM
registers.

And then there's the REX prefix, which is actually four prefixes built into one.
The "base" prefix is 0x40, with the lower 4 bits encoding the four "real" prefixes.
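
For reference, a minimal sketch of picking a REX byte apart.  The struct and
function names are hypothetical; the bit names (W, R, X, B) and positions are
the ones the SDM uses, and the 0x40-0x4f range is only a REX prefix in 64-bit
mode.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct rex_bits {
	bool w;	/* bit 3: 64-bit operand size */
	bool r;	/* bit 2: extends ModRM.reg */
	bool x;	/* bit 1: extends SIB.index */
	bool b;	/* bit 0: extends ModRM.rm, SIB.base, or the opcode register */
};

/* In 64-bit mode, 0x40-0x4f are REX prefixes. */
static bool is_rex(uint8_t byte)
{
	return (byte & 0xf0) == 0x40;
}

static struct rex_bits decode_rex(uint8_t byte)
{
	struct rex_bits bits;

	assert(is_rex(byte));

	bits.w = byte & 0x08;
	bits.r = byte & 0x04;
	bits.x = byte & 0x02;
	bits.b = byte & 0x01;
	return bits;
}
```

E.g. the 0x48 that leads the 64-bit encodings above decodes to W=1 with R, X,
and B all clear, whereas 0x66 is not a REX prefix at all.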