The patch titled
     mm: clear node in N_HIGH_MEMORY and stop kswapd when all memory is offlined
has been added to the -mm tree.  Its filename is
     mm-clear-node-in-n_high_memory-and-stop-kswapd-when-all-memory-is-offlined.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/SubmitChecklist when testing your code ***

See http://userweb.kernel.org/~akpm/stuff/added-to-mm.txt to find
out what to do about this

The current -mm tree may be found at http://userweb.kernel.org/~akpm/mmotm/

------------------------------------------------------
Subject: mm: clear node in N_HIGH_MEMORY and stop kswapd when all memory is offlined
From: David Rientjes <rientjes@xxxxxxxxxx>

When memory is hot-removed, its node must be cleared in N_HIGH_MEMORY if
there are no present pages left.

In such a situation, kswapd must also be stopped since it has nothing left
to do.

Signed-off-by: David Rientjes <rientjes@xxxxxxxxxx>
Signed-off-by: Lee Schermerhorn <lee.schermerhorn@xxxxxx>
Cc: Christoph Lameter <cl@xxxxxxxxxxxxxxxxxxxx>
Cc: Yasunori Goto <y-goto@xxxxxxxxxxxxxx>
Cc: Mel Gorman <mel@xxxxxxxxx>
Cc: Rafael J. Wysocki <rjw@xxxxxxx>
Cc: Rik van Riel <riel@xxxxxxxxxx>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@xxxxxxxxxxxxxx>
Cc: Lee Schermerhorn <lee.schermerhorn@xxxxxx>
Cc: Mel Gorman <mel@xxxxxxxxx>
Cc: Randy Dunlap <randy.dunlap@xxxxxxxxxx>
Cc: Nishanth Aravamudan <nacc@xxxxxxxxxx>
Cc: Andi Kleen <andi@xxxxxxxxxxxxxx>
Cc: David Rientjes <rientjes@xxxxxxxxxx>
Cc: Adam Litke <agl@xxxxxxxxxx>
Cc: Andy Whitcroft <apw@xxxxxxxxxxxxx>
Cc: Eric Whitney <eric.whitney@xxxxxx>
Cc: Christoph Lameter <cl@xxxxxxxxxxxxxxxxxxxx>
Signed-off-by: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
---

diff -puN include/linux/swap.h~mm-clear-node-in-n_high_memory-and-stop-kswapd-when-all-memory-is-offlined include/linux/swap.h
--- a/include/linux/swap.h~mm-clear-node-in-n_high_memory-and-stop-kswapd-when-all-memory-is-offlined
+++ a/include/linux/swap.h
@@ -273,6 +273,7 @@ extern int scan_unevictable_register_nod
 extern void scan_unevictable_unregister_node(struct node *node);
 
 extern int kswapd_run(int nid);
+extern void kswapd_stop(int nid);
 
 #ifdef CONFIG_MMU
 /* linux/mm/shmem.c */
diff -puN mm/memory_hotplug.c~mm-clear-node-in-n_high_memory-and-stop-kswapd-when-all-memory-is-offlined mm/memory_hotplug.c
--- a/mm/memory_hotplug.c~mm-clear-node-in-n_high_memory-and-stop-kswapd-when-all-memory-is-offlined
+++ a/mm/memory_hotplug.c
@@ -842,6 +842,10 @@ repeat:
 
 	setup_per_zone_wmarks();
 	calculate_zone_inactive_ratio(zone);
+	if (!node_present_pages(node)) {
+		node_clear_state(node, N_HIGH_MEMORY);
+		kswapd_stop(node);
+	}
 
 	vm_total_pages = nr_free_pagecache_pages();
 	writeback_set_ratelimit();
diff -puN mm/vmscan.c~mm-clear-node-in-n_high_memory-and-stop-kswapd-when-all-memory-is-offlined mm/vmscan.c
--- a/mm/vmscan.c~mm-clear-node-in-n_high_memory-and-stop-kswapd-when-all-memory-is-offlined
+++ a/mm/vmscan.c
@@ -2167,6 +2167,7 @@ static int kswapd(void *p)
 	order = 0;
 	for ( ; ; ) {
 		unsigned long new_order;
+		int ret;
 
 		prepare_to_wait(&pgdat->kswapd_wait, &wait, TASK_INTERRUPTIBLE);
 		new_order = pgdat->kswapd_max_order;
@@ -2178,19 +2179,23 @@ static int kswapd(void *p)
 			 */
 			order = new_order;
 		} else {
-			if (!freezing(current))
+			if (!freezing(current) && !kthread_should_stop())
 				schedule();
 
 			order = pgdat->kswapd_max_order;
 		}
 		finish_wait(&pgdat->kswapd_wait, &wait);
 
-		if (!try_to_freeze()) {
-			/* We can speed up thawing tasks if we don't call
-			 * balance_pgdat after returning from the refrigerator
-			 */
+		ret = try_to_freeze();
+		if (kthread_should_stop())
+			break;
+
+		/*
+		 * We can speed up thawing tasks if we don't call balance_pgdat
+		 * after returning from the refrigerator
+		 */
+		if (!ret)
 			balance_pgdat(pgdat, order);
-		}
 	}
 	return 0;
 }
@@ -2445,6 +2450,17 @@ int kswapd_run(int nid)
 	return ret;
 }
 
+/*
+ * Called by memory hotplug when all memory in a node is offlined.
+ */
+void kswapd_stop(int nid)
+{
+	struct task_struct *kswapd = NODE_DATA(nid)->kswapd;
+
+	if (kswapd)
+		kthread_stop(kswapd);
+}
+
 static int __init kswapd_init(void)
 {
 	int nid;
_

Patches currently in -mm which might be from rientjes@xxxxxxxxxx are

linux-next.patch
revert-mm-oom-analysis-add-buffer-cache-information-to-show_free_areas.patch
oom-dump-stack-and-vm-state-when-oom-killer-panics.patch
nodemask-make-nodemask_alloc-more-general.patch
hugetlb-rework-hstate_next_node_-functions.patch
hugetlb-add-nodemask-arg-to-huge-page-alloc-free-and-surplus-adjust-functions.patch
hugetlb-add-nodemask-arg-to-huge-page-alloc-free-and-surplus-adjust-functions-fix.patch
hugetlb-factor-init_nodemask_of_node.patch
hugetlb-derive-huge-pages-nodes-allowed-from-task-mempolicy.patch
hugetlb-add-generic-definition-of-numa_no_node.patch
hugetlb-add-per-node-hstate-attributes.patch
hugetlb-update-hugetlb-documentation-for-numa-controls.patch
hugetlb-use-only-nodes-with-memory-for-huge-pages.patch
mm-clear-node-in-n_high_memory-and-stop-kswapd-when-all-memory-is-offlined.patch
hugetlb-handle-memory-hot-plug-events.patch
hugetlb-offload-per-node-attribute-registrations.patch
mm-add-gfp-flags-for-nodemask_alloc-slab-allocations.patch
do_wait-optimization-do-not-place-sub-threads-on-task_struct-children-list.patch

--
To unsubscribe from this list: send the line "unsubscribe mm-commits" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html