On Fri, Oct 20, 2023, Pawan Gupta wrote:
> During VMentry VERW is executed to mitigate MDS. After VERW, any memory
> access like register push onto stack may put host data in MDS affected
> CPU buffers. A guest can then use MDS to sample host data.
>
> Although likelihood of secrets surviving in registers at current VERW
> callsite is less, but it can't be ruled out. Harden the MDS mitigation
> by moving the VERW mitigation late in VMentry path.
>
> Note that VERW for MMIO Stale Data mitigation is unchanged because of
> the complexity of per-guest conditional VERW which is not easy to handle
> that late in asm with no GPRs available. If the CPU is also affected by
> MDS, VERW is unconditionally executed late in asm regardless of guest
> having MMIO access.
>
> Signed-off-by: Pawan Gupta <pawan.kumar.gupta@xxxxxxxxxxxxxxx>
> ---
>  arch/x86/kvm/vmx/vmenter.S |  9 +++++++++
>  arch/x86/kvm/vmx/vmx.c     | 10 +++++++---
>  2 files changed, 16 insertions(+), 3 deletions(-)
>
> diff --git a/arch/x86/kvm/vmx/vmenter.S b/arch/x86/kvm/vmx/vmenter.S
> index be275a0410a8..efa716cf4727 100644
> --- a/arch/x86/kvm/vmx/vmenter.S
> +++ b/arch/x86/kvm/vmx/vmenter.S
> @@ -1,6 +1,7 @@
>  /* SPDX-License-Identifier: GPL-2.0 */
>  #include <linux/linkage.h>
>  #include <asm/asm.h>
> +#include <asm/segment.h>
>  #include <asm/bitsperlong.h>
>  #include <asm/kvm_vcpu_regs.h>
>  #include <asm/nospec-branch.h>
> @@ -31,6 +32,8 @@
>  #define VCPU_R15 __VCPU_REGS_R15 * WORD_SIZE
>  #endif
>
> +#define GUEST_CLEAR_CPU_BUFFERS USER_CLEAR_CPU_BUFFERS
> +
>  .macro VMX_DO_EVENT_IRQOFF call_insn call_target
>          /*
>           * Unconditionally create a stack frame, getting the correct RSP on the
> @@ -177,10 +180,16 @@ SYM_FUNC_START(__vmx_vcpu_run)
>           * the 'vmx_vmexit' label below.
>           */
>  .Lvmresume:
> +        /* Mitigate CPU data sampling attacks .e.g. MDS */
> +        GUEST_CLEAR_CPU_BUFFERS

I have a very hard time believing that it's worth duplicating the mitigation for
VMRESUME vs. VMLAUNCH just to land it after a Jcc.

 3b1:   48 8b 00                mov    (%rax),%rax
 3b4:   74 18                   je     3ce <__vmx_vcpu_run+0x9e>
 3b6:   eb 0e                   jmp    3c6 <__vmx_vcpu_run+0x96>
 3b8:   0f 00 2d 05 00 00 00    verw   0x5(%rip)        # 3c4 <__vmx_vcpu_run+0x94>
 3bf:   0f 1f 80 00 00 18 00    nopl   0x180000(%rax)
 3c6:   0f 01 c3                vmresume
 3c9:   e9 c9 00 00 00          jmp    497 <vmx_vmexit+0xa7>
 3ce:   eb 0e                   jmp    3de <__vmx_vcpu_run+0xae>
 3d0:   0f 00 2d 05 00 00 00    verw   0x5(%rip)        # 3dc <__vmx_vcpu_run+0xac>
 3d7:   0f 1f 80 00 00 18 00    nopl   0x180000(%rax)
 3de:   0f 01 c2                vmlaunch

Also, would it be better to put the NOP first?  Or even better, out of line?
It'd be quite hilarious if the CPU pulled a stupid and speculated on the operand
of the NOP, i.e. if the user/guest controlled RAX allowed for pulling in data
after the VERW.