On Wed, Jul 13, 2022, Sean Christopherson wrote:
> On Wed, Jul 13, 2022, Vitaly Kuznetsov wrote:
> > It seems to be a misconception that "A" places an u64 operand to
> > EAX:EDX, at least with GCC11.
>
> It's not a misconception, it's just that the "A" trick only works for 32-bit
> binaries.  For 64-bit, the 64-bit integer fits into "rax" without needing to spill
> into "rdx".
>
> I swear I had fixed this, but apparently I had only done that locally and never
> pushed/posted the changes :-/

Ugh, I have a feeling I fixed RDMSR and then forgot about WRMSR.

> > While writing a new test, I've noticed that wrmsr_safe() tries putting
> > garbage to the upper bits of the MSR, e.g.:
> >
> >   kvm_exit:             reason MSR_WRITE rip 0x402919 info 0 0
> >   kvm_msr:              msr_write 40000118 = 0x60000000001 (#GP)
> > ...
> > when it was supposed to write '1'. Apparently, "A" works the same as
> > "a" and not as EAX/EDX. Here's the relevant disassembled part:
> >
> > With "A":
> >
> > 	48 8b 43 08          	mov    0x8(%rbx),%rax
> > 	49 b9 ba da ca ba 0a 	movabs $0xabacadaba,%r9
> > 	00 00 00
> > 	4c 8d 15 07 00 00 00 	lea    0x7(%rip),%r10        # 402f44 <guest_msr+0x34>
> > 	4c 8d 1d 06 00 00 00 	lea    0x6(%rip),%r11        # 402f4a <guest_msr+0x3a>
> > 	0f 30                	wrmsr
> >
> > With "a"/"d":
> >
> > 	48 8b 43 08          	mov    0x8(%rbx),%rax
> > 	48 89 c2             	mov    %rax,%rdx
> > 	48 c1 ea 20          	shr    $0x20,%rdx

Huh.  This is wrong.  RAX is loaded with the full 64-bit value.  It doesn't matter
for WRMSR because WRMSR only consumes EAX, but it's wrong.  I can't for the life
of me figure out why casting to a u32 doesn't force the compiler to truncate the
value.  Truncation in other places most definitely works, and the compiler loads
only EAX and EDX when using a hardcoded value, e.g. -1ull, so the input isn't
messed up.  There are no 32-bit loads of EAX, so no implicit truncation of
RAX[63:32].

gcc-{7,9,11} and clang-13 generate the same code, so either it's a really
longstanding bug, or maybe some funky undocumented behavior?

If I use

	return kvm_asm_safe("wrmsr", "a"(val & -1u), "d"(val >> 32), "c"(msr));

then the result is as expected:

	48 8b 53 08          	mov    0x8(%rbx),%rdx
	89 d0                	mov    %edx,%eax
	48 c1 ea 20          	shr    $0x20,%rdx

I'll post a v2 of just this patch on your behalf.  I also reworded the changelog
to include the gcc documentation that talks about the behavior of "A".
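
For reference, here is a minimal sketch in plain C of why the "A" variant produced
0x60000000001 and why the explicit split is fine even when RAX still carries the
full 64-bit value.  It is a model only, not the selftest code: msr_lo(), msr_hi()
and wrmsr_model() are hypothetical helpers, and the stale RDX value of 0x600 is
inferred from the trace above rather than observed directly.

```c
#include <assert.h>
#include <stdint.h>

/* Low half of the value, i.e. what must end up in EAX. */
static inline uint32_t msr_lo(uint64_t val)
{
	return (uint32_t)val;
}

/* High half of the value, i.e. what must end up in EDX. */
static inline uint32_t msr_hi(uint64_t val)
{
	return (uint32_t)(val >> 32);
}

/*
 * Hypothetical model of the value WRMSR writes: EDX:EAX, with bits 63:32
 * of both RAX and RDX ignored.
 */
static inline uint64_t wrmsr_model(uint64_t rax, uint64_t rdx)
{
	return ((uint64_t)(uint32_t)rdx << 32) | (uint32_t)rax;
}

static inline void msr_split_selfcheck(void)
{
	uint64_t val = 1;

	/*
	 * "A" in a 64-bit build: the whole value lands in RAX and RDX keeps
	 * whatever it held before (0x600 is an assumption, see above).
	 */
	assert(wrmsr_model(val, 0x600) == 0x60000000001ull);

	/* Explicit "a"/"d" split: the written value matches the input. */
	assert(wrmsr_model(msr_lo(val), msr_hi(val)) == val);
}
```

Calling msr_split_selfcheck() from any test program exercises both cases.  Passing
the untruncated value as the first argument of wrmsr_model() gives the same result
as passing msr_lo(val), which is the "it doesn't matter for WRMSR" part.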