> On Apr 18, 2022, at 12:28 PM, Sean Christopherson <seanjc@xxxxxxxxxx> wrote:
>
> On Fri, Apr 15, 2022, Jon Kohler wrote:
>>
>>> On Apr 15, 2022, at 10:28 AM, Sean Christopherson <seanjc@xxxxxxxxxx> wrote:
>>> But stepping back, why does KVM do its own IBPB in the first place? The goal is
>>> to prevent one vCPU from attacking the next vCPU run on the same pCPU. But unless
>>> userspace is running multiple VMs in the same process/mm_struct, switching vCPUs,
>>> i.e. switching tasks, will also switch mm_structs and thus do IBPB via cond_mitigation.
>>
>> Good question, I couldn’t figure out the answer to this by walking the code and looking
>> at git history/blame for this area. Are there VMMs that even run multiple VMs within
>> the same process? The only case I could think of is a nested situation?
>
> Selftests? :-)

Ah! I’ll take a mulligan, I was only thinking about run-of-the-mill user stuff, not
the tests, thx.

>
>>> If userspace runs multiple VMs in the same process, enables cond_ibpb, _and_ sets
>>> TIF_SPEC_IB, then it's being stupid and isn't getting full protection in any case,
>>> e.g. if userspace is handling an exit-to-userspace condition for two vCPUs from
>>> different VMs, then the kernel could switch between those two vCPUs' tasks without
>>> bouncing through KVM and thus without doing KVM's IBPB.
>>
>> Exactly, meaning that the only time this would make sense is for some sort of nested
>> situation or some other funky VMM tomfoolery, but that nested hypervisor might not be
>> KVM, so it's a farce, yea? Meaning that even in that case, there is zero guarantee
>> from the host kernel perspective that barriers within that process are being issued on
>> switch, which would make this security posture just window dressing?
>>
>>>
>>> I can kinda see doing this for always_ibpb, e.g. if userspace is unaware of spectre
>>> and is naively running multiple VMs in the same process.
>>
>> Agreed. I’ve thought of always_ibpb as "paranoid mode", and if a user signs up for that,
>> they rarely care about the fast path / performance implications, even if some of the
>> security surface area is just complete window dressing :(
>>
>> Looking forward, what if we simplified this to have KVM issue barriers IFF always_ibpb?
>>
>> And drop the cond’s, since the switching of mm_structs should take care of that?
>>
>> The nice part is that then the cond_mitigation() path gracefully handles both switching
>> to a thread with the flag and switching away from a thread with the flag, and we don’t
>> need to try to duplicate that smarts in KVM code or somewhere else.
>
> Unless there's an edge case we're overlooking, that has my vote. And if the
> above is captured in a comment, then there shouldn't be any confusion as to why
> the kernel/KVM is consuming a flag named "switch_mm" when switching vCPUs, i.e.
> when there may or may not have been a change in mm structs.

Ok great. I’ll work up a v2 and send it out.
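
Roughly, the direction above would look something like this (untested sketch, just to
illustrate the idea; kvm_vcpu_switch_ibpb() is an illustrative name for wherever KVM
currently issues its vCPU-switch barrier, not an existing helper):

#include <asm/nospec-branch.h>	/* indirect_branch_prediction_barrier(), switch_mm_always_ibpb */

/* Illustrative helper; actual name/placement TBD in v2. */
static inline void kvm_vcpu_switch_ibpb(void)
{
	/*
	 * Only do KVM's own barrier when the user asked for unconditional
	 * IBPB on mm switches.  For cond_ibpb, switching vCPUs means
	 * switching tasks and (outside of the multi-VM-per-process case)
	 * mm_structs, so cond_mitigation() already issues the barrier for
	 * tasks that requested it.
	 */
	if (static_branch_unlikely(&switch_mm_always_ibpb))
		indirect_branch_prediction_barrier();
}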