On Tue, Dec 13, 2011 at 8:23 AM, Avi Kivity <avi@xxxxxxxxxx> wrote:
> On 12/13/2011 03:10 PM, Christoffer Dall wrote:
>> >> the question is just which mappings are the
>> >> most efficient to reclaim.
>> >
>> > Do you have accessed bits in those PTEs?
>> >
>>
>> nope. We can protect the underlying target pages though, but...
>
> Yeah, we have the same issue with one of the vendors. Fortunately only
> 90% of the market is affected.
>

:)

>> > It's not really critical to have efficient reclaim here, since it
>> > happens so rarely. It just needs to do something.
>> >
>>
>> when would you trigger it - when it reaches a certain limit, or? And
>> then what, free the lot and re-allocate what's needed?
>
> The kernel triggers it based on internal pressure. It tells you how
> much pressure to apply, so you just translate it to a number of pages to
> free.
>

ok, so we pick those pages at random? (perhaps trying to avoid hitting
the guest kernel at least for Linux, or...?)

>> >> The other problem, the actual guest memory consuming too much memory,
>> >> I assumed this limit would be set by the user when creating his/her
>> >> VM, or can we do something smarter? (again, forgive my ignorance).
>> >> What is the alternative to pinning actual guest pages
>> >
>> > mmu notifiers - pages aren't pinned; instead, Linux calls back into kvm
>> > when modifying a host pte, and kvm responds by dropping or modifying its
>> > translation (second stage pte in your case).
>> >
>>
>> ah ok, so this works across VM boundary. Based on hyper-calls I presume?
>
> No, it's completely internal to the host.
>

ok, got you. I got thrown off by the "Linux calls back into kvm" statement.

> See for example kvm_mmu_notifier_invalidate_page() (in common code).
> It's called when Linux-as-host wants to change a pte (say to swap a
> page). kvm responds by translating the host virtual address into a
> guest physical address (via the memory slots table), then zapping the
> relevant pte and flushing any TLBs which may have cached the pte.
>
>> > mmu notifiers are also useful for other optimizations, like ksm,
>> > ballooning, and transparent huge pages.
>> >
>>
>> I know those features have to be supported eventually. The question is
>> if all this must be in place before a merge upstream?
>
> It doesn't have to be there for the merge but I recommend giving it high
> priority. At least read and understand the code so the addition will
> follow naturally.
>

will do - I will make it a Christmas activity.
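
For reference, the invalidate flow described in the thread (host virtual address to guest physical address via the memory slots table, then zap the second stage pte and flush) can be sketched as a small user-space C model. This is only a toy under stated assumptions, not the KVM code: struct memslot, struct toy_vm, hva_to_gpa() and toy_invalidate_page() are made-up names, the slot fields are a simplified stand-in for the real memory slot, the second stage table is a flat array of valid bits, and the TLB flush is just a counter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SHIFT   12   /* assumption: 4 KiB pages */
#define TOY_NR_GFNS  64   /* size of the toy second stage table */

/* Hypothetical, simplified memory slot: one contiguous guest-physical
 * range backed by one contiguous host-virtual range. */
struct memslot {
    uint64_t base_gfn;        /* first guest frame number in the slot */
    uint64_t npages;          /* slot length in pages */
    uint64_t userspace_addr;  /* host virtual address backing base_gfn */
};

/* Hypothetical VM state: slots, a flat "second stage" valid bitmap
 * indexed by guest frame number, and a flush counter. */
struct toy_vm {
    const struct memslot *slots;
    size_t nslots;
    bool stage2_valid[TOY_NR_GFNS];
    unsigned int tlb_flushes;
};

/* Walk the slot table and translate a host virtual address into a
 * guest physical address. Returns false if no slot covers the hva. */
bool hva_to_gpa(const struct memslot *slots, size_t nslots,
                uint64_t hva, uint64_t *gpa)
{
    assert(gpa != NULL);
    for (size_t i = 0; i < nslots; i++) {
        uint64_t start = slots[i].userspace_addr;
        uint64_t end = start + (slots[i].npages << PAGE_SHIFT);

        if (hva >= start && hva < end) {
            *gpa = (slots[i].base_gfn << PAGE_SHIFT) + (hva - start);
            return true;
        }
    }
    return false;
}

/* Model of the notifier callback: the host is about to change the pte
 * for hva, so drop the matching second stage entry and "flush".
 * Returns true only if a live translation was actually zapped. */
bool toy_invalidate_page(struct toy_vm *vm, uint64_t hva)
{
    uint64_t gpa;

    if (!hva_to_gpa(vm->slots, vm->nslots, hva, &gpa))
        return false;              /* hva is not guest memory */

    uint64_t gfn = gpa >> PAGE_SHIFT;
    if (gfn >= TOY_NR_GFNS || !vm->stage2_valid[gfn])
        return false;              /* nothing mapped, nothing to flush */

    vm->stage2_valid[gfn] = false; /* zap the second stage pte */
    vm->tlb_flushes++;             /* stand-in for a real TLB flush */
    return true;
}
```

With a single slot at base_gfn 16 backed at host address 0x7f0000000000, calling toy_invalidate_page() on hva 0x7f0000002010 resolves to gpa 0x12010 (gfn 18), clears that entry and bumps the flush counter once; a second call on the same address finds nothing mapped and does not flush again. The point of the model is that no page is ever pinned: the guest simply faults the page back in later.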