On Wed, 23 Jul 2014, Alex Thorlton wrote:

> > It's also been a long-standing issue that cpusets and mempolicies are
> > ignored by khugepaged that allows memory to be migrated remotely to nodes
> > that are not allowed by a cpuset's mems or a mempolicy's nodemask.  Even
> > with this issue fixed, you may find that some memory is migrated remotely,
> > although it may be negligible, by khugepaged.
>
> A bit here and there is manageable.  There is, of course, some work to
> be done there, but for now we're mainly concerned with a job that's
> supposed to be confined to a cpuset spilling out and soaking up all the
> memory on a machine.
>

You may find my patch[*] in -mm to be helpful if you enable
zone_reclaim_mode.  It changes khugepaged so that it is not allowed to
migrate any memory to a remote node where the distance between the nodes
is greater than RECLAIM_DISTANCE.

These issues are still pending and we've encountered a couple of them in
the past weeks ourselves.  The definition of RECLAIM_DISTANCE, currently
at 30 for x86, relies on the SLIT to define when remote access is costly,
and there are cases where people need to alter the BIOS to work around
this definition.

We can hope that NUMA balancing will solve a lot of these problems for
us, but there's always a chance that the VM does something totally wrong,
which you've undoubtedly encountered already.

 [*] http://ozlabs.org/~akpm/mmots/broken-out/mm-thp-only-collapse-hugepages-to-nodes-with-affinity-for-zone_reclaim_mode.patch
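
To make the RECLAIM_DISTANCE rule described in the mail concrete, here is
a minimal userspace sketch in Python.  It is not the kernel code and not
the patch itself; it only illustrates the comparison "distance greater
than RECLAIM_DISTANCE means not a valid target".  The function names are
hypothetical, the threshold of 30 is the x86 value quoted in the mail,
and the sample row is made up.  It assumes node ids are contiguous and
start at 0, and that a distance row is a space-separated list of
integers, as in /sys/devices/system/node/nodeN/distance.

```python
# Illustrative sketch only; not the kernel implementation.
RECLAIM_DISTANCE = 30  # x86 value quoted in the mail; an assumption elsewhere


def parse_distance_row(text):
    """Parse one node's distance row, e.g. '10 21 31 41', into integers."""
    return [int(tok) for tok in text.split()]


def allowed_collapse_targets(distances, reclaim_distance=RECLAIM_DISTANCE):
    """Return the node ids that are not farther away than the threshold.

    distances[i] is the distance from the source node to node i
    (assumes contiguous node ids starting at 0).
    """
    return [nid for nid, d in enumerate(distances) if d <= reclaim_distance]


if __name__ == "__main__":
    # Hypothetical distance row for node 0 on a four-node machine.
    row = parse_distance_row("10 21 31 41")
    print(allowed_collapse_targets(row))
```

With the made-up row above, nodes 2 and 3 sit beyond the threshold and
would be excluded.  On a real machine the row would come from the node's
sysfs distance file, which in turn reflects whatever the firmware put in
the SLIT.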