Re: [PATCH] Map guest (ts,tid) to individual shadow tid

On Tuesday 07 April 2009 21:11:11 Liu Yu-B13201 wrote:
> 
> > -----Original Message-----
> > From: Hollis Blanchard [mailto:hollisb@xxxxxxxxxx] 
> > Sent: Tuesday, April 07, 2009 11:41 PM
> > To: Liu Yu-B13201
> > Cc: kvm-ppc@xxxxxxxxxxxxxxx
> > Subject: Re: [PATCH] Map guest (ts,tid) to individual shadow tid
> > 
> > On Tue, 2009-04-07 at 17:51 +0800, Liu Yu wrote:
> > > Hi guys,
> > > 
> > > Thanks to Hollis's idea.
> > > This patch helps make KVMPPC more friendly to OSes other than Linux.
> > 
> > Wow, that was quick. :)
> > 
> > Have you tested this code? Doesn't this break the assumption in
> > kvmppc_e500_tlb1_invalidate() (and probably elsewhere) that stid ==
> > gtid?
> 
> Yes, I have run a simple test.

Once we can reduce the number of TLB flushes (see below) it will be 
interesting to see if there's a performance impact.

> Good catch, it needs to be handled here in kvmppc_e500_tlb1_invalidate().
> Thanks.
> But it's OK for now, because TLB1 only contains tid=0 entries, and tid=0
> is always mapped to stid=0.
> That's why the test passes...

OK. Still, it makes me nervous to break such a simple assumption. We should 
introduce nice accessors to make it difficult to code it wrong in the future. 
I'm honestly surprised that kvmppc_e500_tlb1_invalidate() is the only affected 
site.

> > > Any (ts,tid) combination will be mapped to an individual shadow tid
> > > (stid). Whenever the stids are exhausted, we need to reset the stid
> > > allocator and flush the host TLB.
> > 
> > What I was thinking was that KVM would participate in the host's
> > normal PID allocation policy. Specifically, guest activity could cause
> > the host to invoke steal_context_up/smp() (see switch_mmu_context()
> > for reference).
> 
> The host only uses the AS=0 space, and that context only has 255 PIDs.
> But if we use the AS=1 space, we have another 255 PIDs.
> If we participate in the host's allocation, how can we utilize the AS=1
> PIDs without consuming host PIDs?

I'm suggesting we not use AS=0. We can't use AS=0 and AS=1 PIDs 
interchangeably anyway.

> > When you do that, there is no need for KVM to explicitly flush the
> > TLB ever -- it's done as needed by steal_context*(). That's actually
> > better than what we do today, because today we must flush all guest
> > mappings every time we de-schedule a guest. In the new scheme, when
> > (guest PIDs + host PIDs) < 255, we would never need to flush the TLB
> > due to PID pressure.
> 
> No, we only need to flush the TLB when the additional 255 PIDs are
> exhausted, and that has nothing to do with host PIDs.
> For Linux running in the guest, PIDs won't be exhausted, so no TLB flush
> is needed.

I think this isn't true when using normal (AS=0) host PIDs.

That said, I guess throwing away the AS trick on Linux hosts wouldn't be a 
great idea. Since it's available (on Book E), we might as well take advantage 
of it.

So the only question then is about supporting *guest* use of AS, which I see 
now is what your patch is about. The one big change I would make is to have 
the AS=1 PID allocator be host-global rather than per-vcpu. That way we can 
avoid the unnecessary TLB flushes on every guest deschedule I mentioned 
before. (Imagine guest A using two host AS=1 PIDs, and guest B using five: we 
never need to flush the TLB due to PID pressure.)

That would also put the new PID management code into booke code instead of 
e500 code.

> > (Also, rather than an array, I would have created a hash table using
> > include/linux/hash.h and include/linux/list.h. I haven't thought it
> > through all the way though... I mean the tradeoffs of an array vs hash
> > table here.)
> > 
> 
> Hmm. What do you think we can benefit from hash table?

Lower memory consumption. You know it's going to be sparse, since not every 
guest AS|PID will have a corresponding host PID. However, 512 bytes isn't bad, 
especially if it's shared by all vcpus. Don't worry about it.

-- 
Hollis Blanchard
IBM Linux Technology Center
--
To unsubscribe from this list: send the line "unsubscribe kvm-ppc" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html
