> On May 10, 2022, at 10:44 AM, Sean Christopherson <seanjc@xxxxxxxxxx> wrote: > > On Sat, Apr 30, 2022, Borislav Petkov wrote: >> On Sat, Apr 30, 2022 at 02:50:35PM +0000, Jon Kohler wrote: >>> This is 100% a fair ask, I appreciate the diligence, as we’ve all been there >>> on the ‘other side’ of changes to complex areas and spend hours digging on >>> git history, LKML threads, SDM/APM, and other sources trying to derive >>> why the heck something is the way it is. >> >> Yap, that's basically proving my point and why I want stuff to be >> properly documented so that the question "why was it done this way" can >> always be answered satisfactorily. >> >>> AFAIK, the KVM IBPB is avoided when switching in between vCPUs >>> belonging to the same vmcs/vmcb (i.e. the same guest), e.g. you could >>> have one VM highly oversubscribed to the host and you wouldn’t see >>> either the KVM IBPB or the switch_mm IBPB. All good. >>> >>> Reference vmx_vcpu_load_vmcs() and svm_vcpu_load() and the >>> conditionals prior to the barrier. >> >> So this is where something's still missing. >> >>> However, the pain ramps up when you have a bunch of separate guests, >>> especially with a small amount of vCPUs per guest, so the switching is more >>> likely to be in between completely separate guests. >> >> If the guests are completely separate, then it should fall into the >> switch_mm() case. >> >> Unless it has something to do with, as I looked at the SVM side of >> things, the VMCBs: >> >> if (sd->current_vmcb != svm->vmcb) { >> >> So it is not only different guests but also within the same guest and >> when the VMCB of the vCPU is not the current one. > > Yep. > >> But then if VMCB of the vCPU is not the current, per-CPU VMCB, then that >> CPU ran another guest so in order for that other guest to attack the >> current guest, then its branch pred should be flushed. > > That CPU ran a different _vCPU_, whether or not it ran a different guest, i.e. a > different security domain, is unknown. > >> But I'm likely missing a virt aspect here so I'd let Sean explain what >> the rules are... > > I don't think you're missing anything. I think the original 15d45071523d ("KVM/x86: > Add IBPB support") was simply wrong. > > As I see it: > > 1. If the vCPUs belong to the same VM, they are in the same security domain and > do not need an IPBP. > > 2. If the vCPUs belong to different VMs, and each VM is in its own mm_struct, > defer to switch_mm_irqs_off() to handle IBPB as an mm switch is guaranteed to > occur prior to loading a vCPU belonging to a different VMs. > > 3. If the vCPUs belong to different VMs, but multiple VMs share an mm_struct, > then the security benefits of an IBPB when switching vCPUs are dubious, at best. > > If we only consider #1 and #2, then KVM doesn't need an IBPB, period. > > #3 is the only one that's a grey area. I have no objection to omitting IBPB entirely > even in that case, because none of us can identify any tangible value in doing so. Thanks, Sean. Our messages crossed in flight, I sent a reply to your earlier message just a bit ago. This is super helpful to frame this up. What would you think framing the patch like this: x86/speculation, KVM: remove IBPB on vCPU load Remove IBPB that is done on KVM vCPU load, as the guest-to-guest attack surface is already covered by switch_mm_irqs_off() -> cond_mitigation(). The original 15d45071523d ("KVM/x86: Add IBPB support") was simply wrong in its guest-to-guest design intention. There are three scenarios at play here: 1. If the vCPUs belong to the same VM, they are in the same security domain and do not need an IPBP. 2. If the vCPUs belong to different VMs, and each VM is in its own mm_struct, switch_mm_irqs_off() will handle IBPB as an mm switch is guaranteed to occur prior to loading a vCPU belonging to a different VMs. 3. If the vCPUs belong to different VMs, but multiple VMs share an mm_struct, then the security benefits of an IBPB when switching vCPUs are dubious, at best. Issuing IBPB from KVM vCPU load would only cover #3, but there are no real world tangible use cases for such a configuration. If multiple VMs are sharing an mm_structs, prediction attacks are the least of their security worries. Fixes: 15d45071523d ("KVM/x86: Add IBPB support") (Reviewedby/signed of by people here) (Code change simply whacks IBPB in KVM vmx/svm and thats it)