On 26.11.2012, at 22:48, Paul Mackerras wrote:

> On Mon, Nov 26, 2012 at 02:10:33PM +0100, Alexander Graf wrote:
>>
>> On 23.11.2012, at 23:07, Paul Mackerras wrote:
>>
>>> On Fri, Nov 23, 2012 at 04:43:03PM +0100, Alexander Graf wrote:
>>>>
>>>> On 22.11.2012, at 10:28, Paul Mackerras wrote:
>>>>
>>>>> - With the possibility of the host paging out guest pages, the use of
>>>>> H_LOCAL by an SMP guest is dangerous since the guest could possibly
>>>>> retain and use a stale TLB entry pointing to a page that had been
>>>>> removed from the guest.
>>>>
>>>> I don't understand this part. Don't we flush the TLB when the page gets evicted from the shadow HTAB?
>>>
>>> The H_LOCAL flag is something that we invented to allow the guest to
>>> tell the host "I only ever used this translation (HPTE) on the current
>>> vcpu" when it's removing or modifying an HPTE. The idea is that that
>>> would then let the host use the tlbiel instruction (local TLB
>>> invalidate) rather than the usual global tlbie instruction. Tlbiel is
>>> faster because it doesn't need to go out on the fabric and get
>>> processed by all cpus. In fact our guests don't use it at present,
>>> but we put it in because we thought we should be able to get a
>>> performance improvement, particularly on large machines.
>>>
>>> However, the catch is that the guest's setting of H_LOCAL might be
>>> incorrect, in which case we could have a stale TLB entry on another
>>> physical cpu. While the physical page that it refers to is still
>>> owned by the guest, that stale entry doesn't matter from the host's
>>> point of view. But if the host wants to take that page away from the
>>> guest, the stale entry becomes a problem.
>>
>> That's exactly where my question lies. Does that mean we don't flush the TLB entry regardless when we take the page away from the guest?
>
> The question is how to find the TLB entry if the HPTE it came from is
> no longer present. Flushing a TLB entry requires a virtual address.
> When we're taking a page away from the guest we have the real address
> of the page, not the virtual address. We can use the reverse-mapping
> chains to loop through all the HPTEs that map the page, and from each
> HPTE we can (and do) calculate a virtual address and do a TLBIE on
> that virtual address (each HPTE could be at a different virtual
> address).
>
> The difficulty comes when we no longer have the HPTE but we
> potentially have a stale TLB entry, due to having used tlbiel when we
> removed the HPTE. Without the HPTE the only way to get rid of the
> stale TLB entry would be to completely flush all the TLB entries for
> the guest's LPID on every physical CPU it had ever run on. Since I
> don't want to go to that much effort, what I am proposing, and what
> this patch implements, is to not ever use tlbiel when removing HPTEs
> in SMP guests on POWER7.
>
> In other words, what this patch is about is making sure we don't get
> these troublesome stale TLB entries.

I see. You could keep a list of to-be-flushed VAs around that you could skim through when taking a page away from the guest. That way you make the fast case fast (add/remove of page from the guest) and the slow path slow (paging).

But I'm fine with disallowing local flushes on remove completely for now. It would be nice to get performance data on how much this would be a net win though. There are certainly ways of keeping local flushes alive with the scheme above.

Thanks, applied to kvm-ppc-next.

Alex
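
For illustration only, the "list of to-be-flushed VAs" idea mentioned above could look roughly like the sketch below. This is not code from the patch or from the kernel: `struct pending_flush`, `note_local_remove()`, `drain_before_pageout()` and `PENDING_MAX` are hypothetical names, and `global_flush()` is a stand-in that only counts calls where a real implementation would issue a global tlbie. The sketch also takes the conservative route of flushing every recorded VA on page-out, since it does not track which VA mapped which page.

```c
#include <stddef.h>
#include <stdint.h>

#define PENDING_MAX 64  /* hypothetical capacity of the per-guest list */

/* Hypothetical per-guest record of VAs whose HPTE was removed with only
 * a local invalidate, so other physical CPUs may still hold stale
 * TLB entries for them. */
struct pending_flush {
    uint64_t va[PENDING_MAX];
    size_t count;
    unsigned long global_flushes;  /* counts simulated global flushes */
};

/* Stand-in for a global TLB invalidate of one VA; here it only counts. */
static void global_flush(struct pending_flush *pf, uint64_t va)
{
    (void)va;
    pf->global_flushes++;
}

/* Fast path: the HPTE was removed using a local flush, so remember the
 * VA for later. If the list is full, fall back to a global flush now. */
static void note_local_remove(struct pending_flush *pf, uint64_t va)
{
    if (pf->count == PENDING_MAX) {
        global_flush(pf, va);
        return;
    }
    pf->va[pf->count++] = va;
}

/* Slow path: before the host takes a page away from the guest, globally
 * flush every remembered VA and empty the list. Returns the number of
 * VAs that were drained. */
static size_t drain_before_pageout(struct pending_flush *pf)
{
    size_t n = pf->count;

    for (size_t i = 0; i < n; i++)
        global_flush(pf, pf->va[i]);
    pf->count = 0;
    return n;
}
```

The trade-off in this sketch is the one described in the reply: removals stay cheap because they only append to a list, while the cost of the global flushes is moved to the paging path.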