While running a CPU hotplug stress test under memory pressure, it was observed that the machine would halt with no messages logged to console. This is difficult to trigger and required a machine with 8 cores and plenty of memory. Part of the problem is the page allocator is sending IPIs using on_each_cpu() without calling get_online_cpus() to prevent changes to the online cpumask. This allows IPIs to be send to CPUs that are going offline or offline already. At least one bug report has been seen on ppc64 against a 3.0 era kernel that looked like a bug receiving interrupts on a CPU being offlined. This patch starts by adding a call to get_online_cpus() to drain_all_pages() to make it safe versis CPU hotplug. In the context of the page allocator, this causes a problem. It is possible that kthreadd blocks on cpu_hotplug mutex while another process already holding the mutex is blocked waiting for kthreadd to make forward progress leading to deadlock. Additionally, it is important that cpu_hotplug mutex does not become a new hot lock while under pressure. There is also the consideration that CPU hotplug expects that get_online_cpus() is not called frequently as it can lead to livelock in exceptional circumstances (see comment above cpu_hotplug_begin()). Rather than making it safe to call get_online_cpus() from the page allocator, this patch simply removes the page allocator call to drain_all_pages(). To avoid impacting high-order allocation success rates, it still drains the local per-cpu lists for high-order allocations that failed. As a side effect, this reduces the number of IPIs sent during low memory situations. Signed-off-by: Mel Gorman <mgorman@xxxxxxx> --- mm/page_alloc.c | 16 ++++++++++++---- 1 files changed, 12 insertions(+), 4 deletions(-) diff --git a/mm/page_alloc.c b/mm/page_alloc.c index 2b8ba3a..b6df6fc 100644 --- a/mm/page_alloc.c +++ b/mm/page_alloc.c @@ -1119,7 +1119,9 @@ void drain_local_pages(void *arg) */ void drain_all_pages(void) { + get_online_cpus(); on_each_cpu(drain_local_pages, NULL, 1); + put_online_cpus(); } #ifdef CONFIG_HIBERNATION @@ -1982,11 +1984,17 @@ retry: migratetype); /* - * If an allocation failed after direct reclaim, it could be because - * pages are pinned on the per-cpu lists. Drain them and try again + * If a high-order allocation failed after direct reclaim, there is a + * possibility that it is because the necessary buddies have been + * freed to the per-cpu list. Drain the local list and try again. + * drain_all_pages is not used because it is unsafe to call + * get_online_cpus from this context as it is possible that kthreadd + * would block during thread creation and the cost of sending storms + * of IPIs in low memory conditions is quite high. */ - if (!page && !drained) { - drain_all_pages(); + if (!page && order && !drained) { + drain_pages(get_cpu()); + put_cpu(); drained = true; goto retry; } -- 1.7.3.4 -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@xxxxxxxxx. For more info on Linux MM, see: http://www.linux-mm.org/ . Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/ Don't email: <a href=mailto:"dont@xxxxxxxxx"> email@xxxxxxxxx </a>