On 07/06/2012 04:04 PM, Lee Schermerhorn wrote:
On Fri, 2012-07-06 at 12:38 -0400, Rik van Riel wrote:
4. Putting a lot of pages in the swap cache ends up allocating swap space. This means this NUMA migration scheme will only work on systems that have a substantial amount of memory represented by swap space. This is highly unlikely on systems with memory in the TB range. On smaller systems, it could drive the system out of memory (to the OOM killer), by "filling up" the overflow swap with migration pages instead.

5. In the long run, we want the ability to migrate transparent huge pages as one unit. The reason is simple: the performance penalty for running on the wrong NUMA node (10-20%) is on the same order of magnitude as the performance penalty for running with 4kB pages instead of 2MB pages (5-15%). Breaking up large pages into small ones, and having khugepaged reconstitute them on a random NUMA node later on, will negate the performance benefits of both NUMA placement and THP.
When I originally posted the "migrate on fault" series, I posted a separate series with a "migration cache" to avoid the use of swap space for lazy migration:

http://markmail.org/message/xgvvrnn2nk4nsn2e

The migration cache was originally implemented by Marcelo Tosatti for the old memory hotplug project:

http://marc.info/?l=linux-mm&m=109779128211239&w=4

The idea is that you don't need swap space for lazy migration, just an "address_space" where you can park an anon VMA's PTEs while they're "unmapped" to cause migration faults. Based on a suggestion from Christoph Lameter, I had tried to hide the migration cache behind the swap cache interface to minimize changes, mainly in do_swap_page and vmscan/reclaim. It seemed to work, but the difference in reference count semantics for the mig cache -- the entry is removed when the last pte is migrated/mapped -- makes coordination with exit teardown, uh, tricky.
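The reference count semantics described above can be sketched as a toy user-space model. This is not kernel code and not the posted patches: the class `MigrationCache` and its methods `park`, `fault`, and `release` are hypothetical names chosen for illustration. It models only the one property mentioned, namely that an entry needs no swap slot and goes away once the last parked pte has been remapped or torn down.

```python
class MigrationCache:
    """Toy model (hypothetical): parks unmapped pages without using swap slots."""

    def __init__(self):
        # entry id -> [page, number of ptes still pointing at the entry]
        self._entries = {}
        self._next_id = 0

    def park(self, page, pte_count):
        """Record a page unmapped from `pte_count` ptes; return the entry id."""
        if pte_count < 1:
            raise ValueError("a parked page needs at least one pte")
        entry_id = self._next_id
        self._next_id += 1
        self._entries[entry_id] = [page, pte_count]
        return entry_id

    def _drop_reference(self, entry_id):
        """Drop one pte reference; remove the entry when none are left."""
        entry = self._entries[entry_id]
        entry[1] -= 1
        if entry[1] == 0:
            del self._entries[entry_id]
        return entry[0]

    def fault(self, entry_id):
        """A task touched a parked pte: hand back the page to be remapped."""
        return self._drop_reference(entry_id)

    def release(self, entry_id):
        """A task exited without faulting: teardown must drop its reference."""
        self._drop_reference(entry_id)

    def __len__(self):
        return len(self._entries)


cache = MigrationCache()
entry = cache.park("page-A", pte_count=2)
cache.fault(entry)      # first pte remapped, entry still needed by the second
cache.release(entry)    # second task exits, last reference gone
print(len(cache))
```

In this model, forgetting the `release` call on exit would leave the entry behind forever, which is one way to picture why teardown has to be coordinated with the fault path.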
That fixes one of the two problems, but using _PTE_NUMA or _PAGE_PROTNONE looks like it would be both easier and solve both.

-- 
All rights reversed