On Tue, Jun 25, 2019 at 11:30 PM Nadav Amit <namit@xxxxxxxxxx> wrote:
>
> > On Jun 25, 2019, at 8:56 PM, Andy Lutomirski <luto@xxxxxxxxxx> wrote:
> >
> > On Tue, Jun 25, 2019 at 8:41 PM Nadav Amit <namit@xxxxxxxxxx> wrote:
> >>> On Jun 25, 2019, at 8:35 PM, Andy Lutomirski <luto@xxxxxxxxxx> wrote:
> >>>
> >>> On Tue, Jun 25, 2019 at 7:39 PM Nadav Amit <namit@xxxxxxxxxx> wrote:
> >>>>> On Jun 25, 2019, at 2:40 PM, Dave Hansen <dave.hansen@xxxxxxxxx> wrote:
> >>>>>
> >>>>> On 6/12/19 11:48 PM, Nadav Amit wrote:
> >>>>>> Support the new interface of flush_tlb_multi, which also flushes the
> >>>>>> local CPU's TLB, instead of flush_tlb_others that does not. This
> >>>>>> interface is more performant since it parallelizes remote and local TLB
> >>>>>> flushes.
> >>>>>>
> >>>>>> The actual implementation of flush_tlb_multi() is almost identical to
> >>>>>> that of flush_tlb_others().
> >>>>>
> >>>>> This confused me a bit.  I thought we didn't support paravirtualized
> >>>>> flush_tlb_multi() from reading earlier in the series.
> >>>>>
> >>>>> But, it seems like that might be Xen-only and doesn't apply to KVM and
> >>>>> paravirtualized KVM has no problem supporting flush_tlb_multi().  Is
> >>>>> that right?  It might be good to include some of that background in the
> >>>>> changelog to set the context.
> >>>>
> >>>> I'll try to improve the change-logs a bit. There is no inherent reason for
> >>>> PV TLB-flushers not to implement their own flush_tlb_multi(). It is left
> >>>> for future work, and here are some reasons:
> >>>>
> >>>> 1. Hyper-V/Xen TLB-flushing code is not very simple
> >>>> 2. I don't have a proper setup
> >>>> 3. I am lazy
> >>>
> >>> In the long run, I think that we're going to want a way for one CPU to
> >>> do a remote flush and then, with appropriate locking, update the
> >>> tlb_gen fields for the remote CPU.  Getting this right may be a bit
> >>> nontrivial.
> >>
> >> What do you mean by "do a remote flush"?
> >
> > I mean a PV-assisted flush on a CPU other than the CPU that started
> > it.  If you look at flush_tlb_func_common(), it's doing some work that
> > is rather fancier than just flushing the TLB.  By replacing it with
> > just a pure flush on Xen or Hyper-V, we're losing the potential CR3
> > switch and this bit:
> >
> >         /* Both paths above update our state to mm_tlb_gen. */
> >         this_cpu_write(cpu_tlbstate.ctxs[loaded_mm_asid].tlb_gen, mm_tlb_gen);
> >
> > Skipping the former can hurt idle performance, although we should
> > consider just disabling all the lazy optimizations on systems with PV
> > flush.  (And I've asked Intel to help us out here in future hardware.
> > I have no idea what the result of asking will be.)  Skipping the
> > cpu_tlbstate write means that we will do unnecessary flushes in the
> > future, and that's not doing us any favors.
> >
> > In principle, we should be able to do something like:
> >
> > flush_tlb_multi(...);
> > for (each CPU that got flushed) {
> >         spin_lock(something appropriate?);
> >         per_cpu_write(cpu, cpu_tlbstate.ctxs[loaded_mm_asid].tlb_gen, f->new_tlb_gen);
> >         spin_unlock(...);
> > }
> >
> > with the caveat that it's more complicated than this if the flush is a
> > partial flush, and that we'll want to check that the ctx_id still
> > matches, etc.
> >
> > Does this make sense?
>
> Thanks for the detailed explanation. Let me check that I got it right.
>
> You want to optimize cases in which:
>
> 1. A virtual machine

Yes.

> 2. Which issues multiple (remote) TLB shootdowns

Yes.  Or just one followed by a context switch.

Right now it's suboptimal with just two vCPUs and a single remote
flush.  If CPU 0 does a remote PV flush of CPU 1 and then CPU 1
context switches away from the running mm and back, it will do an
unnecessary flush on the way back, because the tlb_gen won't match.
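For reference, the check that goes wrong is roughly this; it's a
simplified sketch of the flush_tlb_func_common() logic, not the actual
code.  Here loaded_mm is assumed to be the CPU's current mm, and
flush_tlb_local() stands in for the real flushing path:

        /*
         * Sketch of the staleness check.  After a PV-assisted remote
         * flush, the remote CPU's ctxs[asid].tlb_gen was never
         * advanced, so this comparison still sees a stale generation
         * and redoes a flush that the hypervisor already performed.
         */
        u16 asid = this_cpu_read(cpu_tlbstate.loaded_mm_asid);
        u64 local_tlb_gen = this_cpu_read(cpu_tlbstate.ctxs[asid].tlb_gen);
        u64 mm_tlb_gen = atomic64_read(&loaded_mm->context.tlb_gen);

        if (local_tlb_gen < mm_tlb_gen) {
                flush_tlb_local();      /* stand-in for the real flush */
                this_cpu_write(cpu_tlbstate.ctxs[asid].tlb_gen, mm_tlb_gen);
        }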
> 3. To remote vCPU which is preempted by the hypervisor

Yes, or even one that isn't preempted.

> 4. And unlike KVM, the hypervisor does not provide facilities for the VM to
> know which vCPU is preempted, and atomically request TLB flush when the vCPU
> is scheduled.
>

I'm not sure this makes much difference to the case I'm thinking of.

All this being said, do we currently have any system that supports
PCID *and* remote flushes?  I guess KVM has some mechanism, but I'm
not that familiar with its exact capabilities.  If I remember right,
Hyper-V doesn't expose PCID yet.

> Right?
>
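And to put a bit more flesh on the "in principle" loop from earlier in
the thread, here is roughly what I have in mind, typed into the mail
client and not compiler-tested.  The per-CPU tlb_gen_lock and the
function name are made up, and a real version would have to handle
partial flushes instead of punting on them:

static DEFINE_PER_CPU(spinlock_t, tlb_gen_lock);        /* hypothetical */

static void propagate_tlb_gen(const struct cpumask *flushed,
                              const struct flush_tlb_info *f)
{
        int cpu;

        for_each_cpu(cpu, flushed) {
                struct tlb_state *s = &per_cpu(cpu_tlbstate, cpu);
                u16 asid;

                spin_lock(&per_cpu(tlb_gen_lock, cpu));
                asid = s->loaded_mm_asid;

                /*
                 * Only publish the new generation if the remote CPU is
                 * still running the mm we flushed and the flush covered
                 * everything; a partial flush needs more care.
                 */
                if (s->ctxs[asid].ctx_id == f->mm->context.ctx_id &&
                    f->end == TLB_FLUSH_ALL)
                        WRITE_ONCE(s->ctxs[asid].tlb_gen, f->new_tlb_gen);

                spin_unlock(&per_cpu(tlb_gen_lock, cpu));
        }
}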