On Fri, Apr 29, 2022 at 09:59:52PM +0000, Sean Christopherson wrote:
> Correct, but KVM also doesn't do IBPB on VM-Exit (or VM-Entry),

Why doesn't it do that? Not needed?

> nor does KVM do IBPB before exiting to userspace.

Same question.

> The IBPB we want to whack is issued only when KVM is switching vCPUs.

Then please document it properly as I've already requested.

> Except that _none_ of that documentation explains why the hell KVM
> does IBPB when switching between vCPUs.

Probably because the folks involved in those patches weren't the hell
mainly virt people. Although I see a bunch of virt people on CC on that
patch.

> : But stepping back, why does KVM do its own IBPB in the first place? The goal is
> : to prevent one vCPU from attacking the next vCPU run on the same pCPU. But unless
> : userspace is running multiple VMs in the same process/mm_struct, switching vCPUs,
> : i.e. switching tasks, will also switch mm_structs and thus do IBPB via cond_mitigation.
> :
> : If userspace runs multiple VMs in the same process,

This keeps popping up. Who does that? Can I get a real-life example of
such VM-based containers or what the hell that is, pls?

> enables cond_ibpb, _and_ sets
> : TIF_SPEC_IB, then it's being stupid and isn't getting full protection in any case,
> : e.g. if userspace is handling an exit-to-userspace condition for two vCPUs from
> : different VMs, then the kernel could switch between those two vCPUs' tasks without
> : bouncing through KVM and thus without doing KVM's IBPB.
> :
> : I can kinda see doing this for always_ibpb, e.g. if userspace is unaware of spectre
> : and is naively running multiple VMs in the same process.

So this needs a clearer definition: what protection are we even talking
about when the address spaces of processes are shared? My naïve
thinking would be: none. They're sharing address space - branch pred.
poisoning between the two is the least of their worries.
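
To make sure we're talking about the same thing, here's how I read the
quoted argument, as a toy model. This is only a sketch and not the
kernel's cond_mitigation() code: toy_task, toy_ibpb_mode and
toy_switch_mm_ibpb() are made-up names, and the conditional case is
simplified to the assumption that the barrier is issued when either
task has TIF_SPEC_IB set.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-ins for illustration only, NOT the kernel's types. */
struct toy_task {
	int  mm_id;	/* identifies the address space (mm_struct) */
	bool spec_ib;	/* models TIF_SPEC_IB being set on the task */
};

enum toy_ibpb_mode { TOY_IBPB_OFF, TOY_IBPB_COND, TOY_IBPB_ALWAYS };

/*
 * Would the mm-switch path issue an IBPB when switching prev -> next?
 * Assumption: the barrier is only considered when the mm changes.
 */
static bool toy_switch_mm_ibpb(enum toy_ibpb_mode mode,
			       const struct toy_task *prev,
			       const struct toy_task *next)
{
	assert(prev && next);

	/* Same address space: nothing is issued from this path. */
	if (prev->mm_id == next->mm_id)
		return false;

	switch (mode) {
	case TOY_IBPB_ALWAYS:
		return true;
	case TOY_IBPB_COND:
		/* Assumed: either side opting in is enough. */
		return prev->spec_ib || next->spec_ib;
	default:
		return false;
	}
}
```

In that model, two vCPU tasks from different processes get the barrier
from the mm switch, and two vCPU tasks sharing an mm get nothing from
it in any mode, which is the only case where a KVM-side IBPB could
make a difference.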

So to cut to the chase: it sounds to me like you don't want to do IBPB
at all on vCPU switch. And the process switch case is taken care of by
switch_mm().

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette