On Mon, 5 Oct 2009, Lee Schermerhorn wrote: > [PATCH 11/11] hugetlb: offload [un]registration of sysfs attr to worker thread > > Against: 2.6.31-mmotm-090925-1435 > > New in V6 > > V7: + remove redundant check for memory{ful|less} node from > node_hugetlb_work(). Rely on [added] return from > hugetlb_register_node() to differentiate between transitions > to/from memoryless state. > That doesn't work because the memory hotplug code doesn't clear the N_HIGH_MEMORY bit for status_change_nid on MEM_OFFLINE, so hugetlb_register_node() will always return true under such conditions. The following should fix it. Christoph? mm: clear node in N_HIGH_MEMORY and stop kswapd when all memory is offlined When memory is hot-removed, its node must be cleared in N_HIGH_MEMORY if there are no present pages left. In such a situation, kswapd must also be stopped since it has nothing left to do. Cc: Christoph Lameter <cl@xxxxxxxxxxxxxxxxxxxx> Cc: Yasunori Goto <y-goto@xxxxxxxxxxxxxx> Cc: Mel Gorman <mel@xxxxxxxxx> Signed-off-by: David Rientjes <rientjes@xxxxxxxxxx> --- include/linux/swap.h | 1 + mm/memory_hotplug.c | 4 ++++ mm/vmscan.c | 28 ++++++++++++++++++++++------ 3 files changed, 27 insertions(+), 6 deletions(-) diff --git a/include/linux/swap.h b/include/linux/swap.h --- a/include/linux/swap.h +++ b/include/linux/swap.h @@ -273,6 +273,7 @@ extern int scan_unevictable_register_node(struct node *node); extern void scan_unevictable_unregister_node(struct node *node); extern int kswapd_run(int nid); +extern void kswapd_stop(int nid); #ifdef CONFIG_MMU /* linux/mm/shmem.c */ diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c --- a/mm/memory_hotplug.c +++ b/mm/memory_hotplug.c @@ -838,6 +838,10 @@ repeat: setup_per_zone_wmarks(); calculate_zone_inactive_ratio(zone); + if (!node_present_pages(node)) { + node_clear_state(node, N_HIGH_MEMORY); + kswapd_stop(node); + } vm_total_pages = nr_free_pagecache_pages(); writeback_set_ratelimit(); diff --git a/mm/vmscan.c b/mm/vmscan.c --- a/mm/vmscan.c +++ b/mm/vmscan.c @@ -2163,6 +2163,7 @@ static int kswapd(void *p) order = 0; for ( ; ; ) { unsigned long new_order; + int ret; prepare_to_wait(&pgdat->kswapd_wait, &wait, TASK_INTERRUPTIBLE); new_order = pgdat->kswapd_max_order; @@ -2174,19 +2175,23 @@ static int kswapd(void *p) */ order = new_order; } else { - if (!freezing(current)) + if (!freezing(current) && !kthread_should_stop()) schedule(); order = pgdat->kswapd_max_order; } finish_wait(&pgdat->kswapd_wait, &wait); - if (!try_to_freeze()) { - /* We can speed up thawing tasks if we don't call - * balance_pgdat after returning from the refrigerator - */ + ret = try_to_freeze(); + if (kthread_should_stop()) + break; + + /* + * We can speed up thawing tasks if we don't call balance_pgdat + * after returning from the refrigerator + */ + if (!ret) balance_pgdat(pgdat, order); - } } return 0; } @@ -2441,6 +2446,17 @@ int kswapd_run(int nid) return ret; } +/* + * Called by memory hotplug when all memory in a node is offlined. + */ +void kswapd_stop(int nid) +{ + struct task_struct *kswapd = NODE_DATA(nid)->kswapd; + + if (kswapd) + kthread_stop(kswapd); +} + static int __init kswapd_init(void) { int nid; -- To unsubscribe from this list: send the line "unsubscribe linux-numa" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html