[RFC] Migrate-on-fault a.k.a. Lazy Page Migration

At the Linux Plumbers Conference, Andi Kleen encouraged me again to
resubmit my automatic page migration patches because he thinks they
will be useful for virtualization. Later, in the Virtualization
mini-conf, the subject came up during a presentation about adding NUMA
awareness to qemu/kvm. After the presentation, I discussed these series
with Andrea Arcangeli and he also encouraged me to post them.

My position within HP has changed such that I'm not sure how much time
I'll have to spend on this area, nor whether I'll have access to the
larger NUMA platforms on which to test the patches thoroughly. However,
here is the second of 4 series that comprise my shared policy
enhancements and lazy/auto-migration enhancement.

I have rebased the patches against a recent mmotm tree. This rebase
built cleanly, booted and passed a few ad hoc tests on x86_64. I've
made a pass over the patch descriptions to update them. If there is
sufficient interest in merging this, I'll do what I can to assist in
the completion and testing of the series.

Based atop the previously posted:

1) Shared policy cleanup, fixes, mapped file policy

To follow:

3) Auto [as in "self"] migration facility
4) a Migration Cache -- originally written by Marcelo Tosatti

I'll announce this series and the automatic/lazy migration series to
follow on lkml, linux-mm, ... However, I'll limit the actual posting to
linux-numa to avoid spamming the other lists.

---

This series of patches implements page migration in the fault path.

!!! N.B., Need to consider interaction with KSM and Transparent Huge
!!! Pages.

The basic idea is that when a fault handler such as do_swap_page()
finds a cached page with zero mappings that is otherwise "stable"--
e.g., no I/O in progress--this is a good opportunity to check whether
the page resides on the node indicated by the mempolicy in the current
context.

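As a rough illustration of the check just described, here is a minimal
userspace sketch. It is not code from this series: struct page_model
and migrate_on_fault_candidate() are hypothetical names invented for
the example, and the fields only stand in for the page state that the
fault path would inspect in the kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical userspace model of the page state examined in the
 * fault path. Illustrative only; this is not the kernel's struct page.
 */
struct page_model {
	int  mapcount;	/* number of page table mappings of the page */
	bool under_io;	/* true while I/O (e.g. writeback) is in progress */
	int  nid;	/* node on which the page currently resides */
};

/*
 * Return true if the page is a candidate for migrate-on-fault:
 * it has zero mappings, is otherwise stable, and does not reside on
 * the node that the faulting context's policy selects (policy_nid).
 */
static bool migrate_on_fault_candidate(const struct page_model *page,
				       int policy_nid)
{
	assert(page != NULL);

	if (page->mapcount != 0)
		return false;	/* still mapped: leave it where it is */
	if (page->under_io)
		return false;	/* not stable: I/O in progress */

	return page->nid != policy_nid;	/* misplaced for this context? */
}
```

With this model, an unmapped, idle page on node 1 is a candidate when
the faulting context's policy selects node 0, and is not a candidate
when the policy selects node 1 or when the page is still mapped or
under I/O.
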
We only attempt to migrate when there are zero mappings because 1) we
can easily migrate the page--we don't have to go through the effort of
removing all mappings--and 2) default policy--a common case--can give
different answers for different tasks running on different nodes.
Checking the policy when there are zero mappings effectively implements
a "first touch" placement policy.

Note that this mechanism could be used to migrate page cache pages that
were read in earlier, are no longer referenced, but are about to be
used by a new task on a different node from the one where the page
resides. The same mechanism can be used to pull anon pages along with a
task when the load balancer decides to move it to another node.
However, that will require a bit more mechanism, and is the subject of
another patch series.

The kernel's direct migration facility supports most of the mechanism
that is required to implement this "migration on fault". Some changes
were needed to the migratepage op functions so that they behave
appropriately when called from the fault path. Then we need to add the
function[s] to test the current page in the fault path for zero
mapping, no writebacks, misplacement, ...; and the function[s] to
actually migrate the page contents to a newly allocated page using the
[modified] migratepage address space operations of the direct migration
mechanism.

This series used to include patches to migrate cached file pages and
shmem pages. Testing with, e.g., kernel builds, showed a great deal of
thrashing of page cache pages, so those patches have been removed. I
think page replication would be a better approach for shared, read-only
pages. Nick Piggin created such a patch quite a while back and I had
integrated it with the automigration series. Those patches have since
gone stale.

---
Lee Schermerhorn