On 27.11.2012, at 00:16, Paul Mackerras wrote:

> On Mon, Nov 26, 2012 at 11:03:19PM +0100, Alexander Graf wrote:
>>
>> On 26.11.2012, at 22:48, Paul Mackerras wrote:
>>
>>> On Mon, Nov 26, 2012 at 02:10:33PM +0100, Alexander Graf wrote:
>>>>
>>>> On 23.11.2012, at 23:07, Paul Mackerras wrote:
>>>>
>>>>> On Fri, Nov 23, 2012 at 04:43:03PM +0100, Alexander Graf wrote:
>>>>>>
>>>>>> On 22.11.2012, at 10:28, Paul Mackerras wrote:
>>>>>>
>>>>>>> - With the possibility of the host paging out guest pages, the use of
>>>>>>> H_LOCAL by an SMP guest is dangerous since the guest could possibly
>>>>>>> retain and use a stale TLB entry pointing to a page that had been
>>>>>>> removed from the guest.
>>>>>>
>>>>>> I don't understand this part. Don't we flush the TLB when the page gets evicted from the shadow HTAB?
>>>>>
>>>>> The H_LOCAL flag is something that we invented to allow the guest to
>>>>> tell the host "I only ever used this translation (HPTE) on the current
>>>>> vcpu" when it's removing or modifying an HPTE. The idea is that that
>>>>> would then let the host use the tlbiel instruction (local TLB
>>>>> invalidate) rather than the usual global tlbie instruction. Tlbiel is
>>>>> faster because it doesn't need to go out on the fabric and get
>>>>> processed by all cpus. In fact our guests don't use it at present,
>>>>> but we put it in because we thought we should be able to get a
>>>>> performance improvement, particularly on large machines.
>>>>>
>>>>> However, the catch is that the guest's setting of H_LOCAL might be
>>>>> incorrect, in which case we could have a stale TLB entry on another
>>>>> physical cpu. While the physical page that it refers to is still
>>>>> owned by the guest, that stale entry doesn't matter from the host's
>>>>> point of view. But if the host wants to take that page away from the
>>>>> guest, the stale entry becomes a problem.
>>>>
>>>> That's exactly where my question lies. Does that mean we don't flush the TLB entry regardless when we take the page away from the guest?
>>>
>>> The question is how to find the TLB entry if the HPTE it came from is
>>> no longer present. Flushing a TLB entry requires a virtual address.
>>> When we're taking a page away from the guest we have the real address
>>> of the page, not the virtual address. We can use the reverse-mapping
>>> chains to loop through all the HPTEs that map the page, and from each
>>> HPTE we can (and do) calculate a virtual address and do a TLBIE on
>>> that virtual address (each HPTE could be at a different virtual
>>> address).
>>>
>>> The difficulty comes when we no longer have the HPTE but we
>>> potentially have a stale TLB entry, due to having used tlbiel when we
>>> removed the HPTE. Without the HPTE the only way to get rid of the
>>> stale TLB entry would be to completely flush all the TLB entries for
>>> the guest's LPID on every physical CPU it had ever run on. Since I
>>> don't want to go to that much effort, what I am proposing, and what
>>> this patch implements, is to not ever use tlbiel when removing HPTEs
>>> in SMP guests on POWER7.
>>>
>>> In other words, what this patch is about is making sure we don't get
>>> these troublesome stale TLB entries.
>>
>> I see. You could keep a list of to-be-flushed VAs around that you could skim through when taking a page away from the guest. That way you make the fast case fast (add/remove of page from the guest) and the slow path slow (paging).
>
> Yes, I thought about that, but the problem is that the list of VAs
> could get arbitrarily long and take up a lot of host memory.

You can always cap it at an arbitrary number, similar to how the TLB itself is limited too.

Alex
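
To make the capped-list idea concrete, here is a rough user-space sketch in C. It is not taken from the patch or from any kernel code: `struct pending_flush`, `PENDING_FLUSH_CAP`, and the two callbacks are hypothetical names invented for illustration. The callbacks stand in for a global per-VA invalidate and for a full flush of the guest's LPID. The assumption is that once the list overflows, individual VAs are forgotten and the next drain falls back to the expensive full flush, which bounds host memory use.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PENDING_FLUSH_CAP 64   /* hypothetical cap, chosen arbitrarily */

/* Hypothetical per-guest record of VAs that were only invalidated locally. */
struct pending_flush {
    uint64_t va[PENDING_FLUSH_CAP];
    size_t   count;
    bool     overflowed;       /* set once the cap was exceeded */
};

/* Stand-ins for a global per-VA invalidate and a full flush of the LPID. */
typedef void (*flush_one_fn)(uint64_t va, void *ctx);
typedef void (*flush_all_fn)(void *ctx);

static void pending_flush_init(struct pending_flush *pf)
{
    assert(pf != NULL);
    pf->count = 0;
    pf->overflowed = false;
}

/* Fast path: remember a VA whose HPTE was removed with a local-only flush. */
static void pending_flush_add(struct pending_flush *pf, uint64_t va)
{
    assert(pf != NULL);
    if (pf->overflowed)
        return;                        /* already committed to a full flush */
    if (pf->count == PENDING_FLUSH_CAP) {
        pf->overflowed = true;         /* give up tracking individual VAs */
        pf->count = 0;
        return;
    }
    pf->va[pf->count++] = va;
}

/*
 * Slow path: call before taking a page away from the guest.
 * Returns the number of per-VA flushes issued, or -1 if a full flush
 * was needed because the list had overflowed.
 */
static int pending_flush_drain(struct pending_flush *pf,
                               flush_one_fn one, flush_all_fn all, void *ctx)
{
    assert(pf != NULL);
    if (pf->overflowed) {
        all(ctx);
        pending_flush_init(pf);
        return -1;
    }
    int issued = (int)pf->count;
    for (size_t i = 0; i < pf->count; i++)
        one(pf->va[i], ctx);
    pf->count = 0;
    return issued;
}

/* Counting stubs so the sketch can be exercised without real hardware. */
struct flush_stats { size_t single; size_t full; };

static void count_one(uint64_t va, void *ctx)
{
    (void)va;
    ((struct flush_stats *)ctx)->single++;
}

static void count_all(void *ctx)
{
    ((struct flush_stats *)ctx)->full++;
}
```

With this shape the common case costs one array store per local invalidate, and the worst case degrades to the full per-LPID flush described earlier in the thread. Whether that trade-off is worthwhile depends on how often guests would actually set H_LOCAL, which the thread notes they do not do at present.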