On Tue, 2023-04-18 at 07:44 -0700, Sean Christopherson wrote:
> On Tue, Apr 18, 2023, David Woodhouse wrote:
> > On Mon, 2023-04-17 at 09:31 -0700, Sean Christopherson wrote:
> > > On Mon, Apr 17, 2023, Metin Kaya wrote:
> > > > HVMOP_flush_tlbs suboperation of hvm_op hypercall allows a guest to
> > > > flush all vCPU TLBs. There is no way for the VMM to flush TLBs from
> > > > userspace.
> > >
> > > Ah, took me a minute to connect the dots.  Monday morning is definitely partly
> > > to blame, but it would be helpful to expand this sentence to be more explicit as
> > > to why userspace's inability to efficiently flush TLBs.
> > >
> > > And strictly speaking, userspace _can_ flush TLBs, just not in a precise, efficient
> > > way.
> >
> > Hm, how? We should probably implement that in userspace as a fallback,
> > however much it sucks.
>
> Oh, the suckage is high :-)  Use KVM_{G,S}ET_SREGS2 to toggle any CR{0,3,4}/EFER
> bit and __set_sregs() will reset the MMU context.

Oh, that suckage is quite high indeed. I'm not actually sure I'll
bother doing this in QEMU. It's quite esoteric; Linux has never used it
and I think that after launching <mumble> million production Xen guests
this way we've only ever seen it used by some FreeBSD variant.

It doesn't seem to be in mainline FreeBSD; there's a hint of it here
https://people.freebsd.org/~dfr/freebsd-6.x-xen-31032009.diff for
example but it's disabled even then:

 	if (pmap == kernel_pmap || pmap->pm_active == all_cpus) {
-		invltlb();
-		smp_invltlb();
+#if defined(XENHVM) && defined(notdef)
+		/*
+		 * As far as I can tell, this makes things slower, at
+		 * least where there are only two physical cpus and
+		 * the host is not overcommitted.
+		 */
+		if (is_running_on_xen()) {
+			HYPERVISOR_hvm_op(HVMOP_flush_tlbs, NULL);
+		} else
+#endif
+		{
+			invltlb();
+			smp_invltlb();
+		}

> Note that without this fix[*]
> that I'm going to squeeze into 6.4, the MMU context reset may result in all TDP
> MMU roots being freed and reallocated.
>
> [*] https://lore.kernel.org/all/20230413231251.1481410-1-seanjc@xxxxxxxxxx
>
> >
> > > >  arch/x86/kvm/xen.c                 | 31 ++++++++++++++++++++++++++++++
> > > >  include/xen/interface/hvm/hvm_op.h |  3 +++
> > >
> > > Modifications to uapi headers is conspicuously missing.  I.e. there likely needs
> > > to be a capability so that userspace can query support.
> >
> > Nah, nobody cares. If the kernel "accelerates" this hypercall, so be
> > it. Userspace will just never get the KVM_EXIT_XEN for that hypercall
> > because it'll be magically handled, like the others.
>
> Ah, that makes sense, I was thinking userspace would complain if it got the
> "unexpected" exit.

I've tried to follow a model where userspace is *always* expected to
implement the hypercall, and if the kernel chooses to intervene to
accelerate it, that's a bonus. It saves a bunch of complexity in error
handling in the kernel when we can just say "screw this, let userspace
cope".
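
To make that model concrete, here is a rough, self-contained sketch of
the dispatch logic. It is not the patch and not real KVM code: every
name in it (struct hypercall, kernel_try_accelerate(),
userspace_handle(), dispatch(), SKETCH_HVMOP_FLUSH_TLBS) is made up for
illustration, the constant's value is an assumption to be checked
against the Xen public headers, and the actual TLB flush is elided.
The only point is the shape: the kernel fast path bails out on anything
it doesn't like, and userspace has a complete implementation to fall
back on.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative value only (assumption); see the Xen public headers. */
#define SKETCH_HVMOP_FLUSH_TLBS 5u

enum handled_by { HANDLED_IN_KERNEL, HANDLED_IN_USERSPACE };

/* Hypothetical, simplified view of one hvm_op hypercall. */
struct hypercall {
	uint64_t op;     /* hvm_op suboperation */
	uint64_t arg;    /* guest argument; must be NULL for flush_tlbs */
	long     result; /* value handed back to the guest */
};

/*
 * Hypothetical kernel fast path. It returns true only if it fully
 * handled the call. On anything unexpected it just returns false,
 * with no error handling of its own.
 */
static bool kernel_try_accelerate(struct hypercall *hc, bool has_fastpath)
{
	if (!has_fastpath)
		return false;
	if (hc->op != SKETCH_HVMOP_FLUSH_TLBS || hc->arg != 0)
		return false;	/* "screw this, let userspace cope" */
	/* A real implementation would flush all vCPU TLBs here. */
	hc->result = 0;
	return true;
}

/*
 * Hypothetical VMM handler. It is always complete, so the guest sees
 * the same result whether or not the kernel accelerates the call.
 */
static void userspace_handle(struct hypercall *hc)
{
	if (hc->op == SKETCH_HVMOP_FLUSH_TLBS)
		hc->result = (hc->arg == 0) ? 0 : -22;	/* -EINVAL */
	else
		hc->result = -38;			/* -ENOSYS */
}

/* The userspace_handle() call stands in for the KVM_EXIT_XEN round trip. */
static enum handled_by dispatch(struct hypercall *hc, bool has_fastpath)
{
	if (kernel_try_accelerate(hc, has_fastpath))
		return HANDLED_IN_KERNEL;
	userspace_handle(hc);
	return HANDLED_IN_USERSPACE;
}
```

With a well-formed flush_tlbs call, dispatch() reports
HANDLED_IN_KERNEL when the fast path is present and
HANDLED_IN_USERSPACE when it is not, and the guest-visible result is 0
either way. A malformed call (non-NULL arg) always ends up in
userspace, which is where the error handling lives.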