On Fri, Oct 05, 2018 at 03:32:26PM +1000, Paul Mackerras wrote:
> On Fri, Oct 05, 2018 at 02:54:28PM +1000, David Gibson wrote:
> > On Fri, Oct 05, 2018 at 02:23:50PM +1000, Paul Mackerras wrote:
> > > On Fri, Oct 05, 2018 at 02:09:08PM +1000, David Gibson wrote:
> > > > On Thu, Oct 04, 2018 at 09:56:02PM +1000, Paul Mackerras wrote:
> > > > > From: Suraj Jitindar Singh <sjitindarsingh@xxxxxxxxx>
> > > > > 
> > > > > This is only done at level 0, since only level 0 knows which physical
> > > > > CPU a vcpu is running on.  This does for nested guests what L0 already
> > > > > did for its own guests, which is to flush the TLB on a pCPU when it
> > > > > goes to run a vCPU there, and there is another vCPU in the same VM
> > > > > which previously ran on this pCPU and has now started to run on another
> > > > > pCPU.  This is to handle the situation where the other vCPU touched
> > > > > a mapping, moved to another pCPU and did a tlbiel (local-only tlbie)
> > > > > on that new pCPU and thus left behind a stale TLB entry on this pCPU.
> > > > > 
> > > > > This introduces a limit on the vcpu_token values used in the
> > > > > H_ENTER_NESTED hcall -- they must now be less than NR_CPUS.
> > > > 
> > > > This does make the vcpu tokens no longer entirely opaque to the L0.
> > > > It works for now, because the only L1 is Linux and we know basically
> > > > how it allocates those tokens.  Eventually we probably want some way
> > > > to either remove this restriction or to advertise the limit to the L1.
> > > 
> > > Right, we could use something like a hash table and have it be
> > > basically just as efficient as the array when the set of IDs is dense
> > > while also handling arbitrary ID values.  (We'd have to make sure that
> > > L1 couldn't trigger unbounded memory consumption in L0, though.)
> > 
> > Another approach would be to sacrifice some performance for L0
> > simplicity: when an L1 vCPU changes pCPU, flush all the nested LPIDs
> > associated with that L1.  When an L2 vCPU changes L1 vCPU (and
> > therefore, indirectly pCPU), the L1 would be responsible for flushing
> > it.
> 
> That was one of the approaches I considered initially, but it has
> complexities that aren't apparent, and it could be quite inefficient
> for a guest with a lot of nested guests.  For a start you have to
> provide a way for L1 to flush the TLB for another LPID, which guests
> can't do themselves (it's a hypervisor privileged operation).  Then
> there's the fact that it's not the pCPU where the moving vCPU has
> moved to that needs the flush, it's the pCPU that it moved from (where
> presumably something else is now running).  All in all, the simplest
> solution was to have L0 do it, because L0 knows unambiguously the real
> physical CPU where any given vCPU last ran.

Ah, I see.

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson
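
As a rough illustration of the hash-table idea floated in the thread above
(this is not the actual KVM code; all names such as nested_vcpu_map and
prev_cpu are hypothetical), L0 could keep a small, bounded map from
arbitrary L1-chosen vcpu_token values to the pCPU each nested vCPU last
ran on.  A hard cap on the number of entries addresses the concern about
L1 triggering unbounded memory consumption in L0:

/*
 * Illustrative sketch only -- a bounded hash table that behaves roughly
 * like the fixed array when token values are dense, but also tolerates
 * sparse or arbitrary vcpu_token values.
 */
#include <stdint.h>
#include <stdlib.h>

#define MAP_BUCKETS	256		/* power of two */
#define MAP_MAX_ENTRIES	2048		/* cap so L1 cannot exhaust L0 memory */

struct nested_vcpu_ent {
	uint64_t vcpu_token;		/* L1-chosen identifier, now arbitrary */
	int prev_cpu;			/* last physical CPU this vCPU ran on */
	struct nested_vcpu_ent *next;
};

struct nested_vcpu_map {
	struct nested_vcpu_ent *buckets[MAP_BUCKETS];
	unsigned int nr_entries;
};

static struct nested_vcpu_ent *map_lookup(struct nested_vcpu_map *map,
					  uint64_t token)
{
	unsigned int b = (unsigned int)(token & (MAP_BUCKETS - 1));
	struct nested_vcpu_ent *e;

	for (e = map->buckets[b]; e; e = e->next)
		if (e->vcpu_token == token)
			return e;
	return NULL;
}

/* Return the entry for token, creating it if needed; NULL if over the cap. */
static struct nested_vcpu_ent *map_get(struct nested_vcpu_map *map,
				       uint64_t token)
{
	unsigned int b = (unsigned int)(token & (MAP_BUCKETS - 1));
	struct nested_vcpu_ent *e = map_lookup(map, token);

	if (e)
		return e;
	if (map->nr_entries >= MAP_MAX_ENTRIES)
		return NULL;		/* refuse rather than grow without bound */
	e = calloc(1, sizeof(*e));
	if (!e)
		return NULL;
	e->vcpu_token = token;
	e->prev_cpu = -1;		/* never run yet */
	e->next = map->buckets[b];
	map->buckets[b] = e;
	map->nr_entries++;
	return e;
}

On the H_ENTER_NESTED path, L0 could then look up (or create) the entry
for the incoming vcpu_token, compare prev_cpu with the current pCPU,
flush along the lines described in the thread when they differ, and
record the current pCPU -- while dropping the vcpu_token < NR_CPUS
restriction that the patch introduces.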