Hello Petr,

On Wed, Dec 09, 2015 at 05:19:59PM +0100, Petr Holasek wrote:
> Hi Andrea,
>
> I've been running stress tests against this patchset for a couple of hours
> and everything was ok. However, I've allocated ~1TB of memory and got the
> following lockup while disabling KSM with 'echo 2 > /sys/kernel/mm/ksm/run':
>
> [13201.060601] INFO: task ksmd:351 blocked for more than 120 seconds.
> [13201.066812]       Not tainted 4.4.0-rc4+ #5
> [13201.070996] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> [13201.078830] ksmd            D ffff883f65eb7dc8     0   351      2 0x00000000
> [13201.085903]  ffff883f65eb7dc8 ffff887f66e26400 ffff883f65d5e400 ffff883f65eb8000
> [13201.093343]  ffffffff81a65144 ffff883f65d5e400 00000000ffffffff ffffffff81a65148
> [13201.100792]  ffff883f65eb7de0 ffffffff816907e5 ffffffff81a65140 ffff883f65eb7df0
> [13201.108242] Call Trace:
> [13201.110708]  [<ffffffff816907e5>] schedule+0x35/0x80
> [13201.115676]  [<ffffffff81690ace>] schedule_preempt_disabled+0xe/0x10
> [13201.122044]  [<ffffffff81692524>] __mutex_lock_slowpath+0xb4/0x130
> [13201.128237]  [<ffffffff816925bf>] mutex_lock+0x1f/0x2f
> [13201.133395]  [<ffffffff811debd2>] ksm_scan_thread+0x62/0x1f0
> [13201.139068]  [<ffffffff810c8ac0>] ? wait_woken+0x80/0x80
> [13201.144391]  [<ffffffff811deb70>] ? ksm_do_scan+0x1140/0x1140
> [13201.150164]  [<ffffffff810a4378>] kthread+0xd8/0xf0
> [13201.155056]  [<ffffffff810a42a0>] ? kthread_park+0x60/0x60
> [13201.160551]  [<ffffffff8169460f>] ret_from_fork+0x3f/0x70
> [13201.165961]  [<ffffffff810a42a0>] ? kthread_park+0x60/0x60
>
> It seems this is not connected with the new code, but it would be nice to
> also make unmerge_and_remove_all_rmap_items() more scheduler friendly.

Agreed. I ran echo 2 many times here with big stable_node chains, but this
one never happened here; it likely shows up more easily on the 1TiB system.
It was almost certainly the teardown of an enormous stable_node chain. While
at it, I also added one more cond_resched() in the echo 2 slow path to make
the vma list walk more scheduler friendly (even though it would never end in
a softlockup in practice, max_map_count can be increased via sysctl, so it's
safer and worth it considering how slow that path is).

From 85f2be622188d82bd1c920dfe71c3134d1f46a6d Mon Sep 17 00:00:00 2001
From: Andrea Arcangeli <aarcange@xxxxxxxxxx>
Date: Wed, 9 Dec 2015 17:53:31 +0100
Subject: [PATCH 1/1] ksm: add reschedule points to unmerge_and_remove_all_rmap_items

"echo 2 >/sys/kernel/mm/ksm/run" wasn't schedule friendly due to the lack
of these reschedule points. unmerge_and_remove_all_rmap_items() can run
into thousands of vmas that aren't VM_MERGEABLE. remove_stable_node_chain()
can have an unlimited number of stable_node dups linked into the
stable_node chain to free.
Signed-off-by: Andrea Arcangeli <aarcange@xxxxxxxxxx>
---
 mm/ksm.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/mm/ksm.c b/mm/ksm.c
index 47fbcfc..c7249f66 100644
--- a/mm/ksm.c
+++ b/mm/ksm.c
@@ -883,6 +883,7 @@ static int remove_stable_node_chain(struct stable_node *stable_node,
 		VM_BUG_ON(!is_stable_node_dup(dup));
 		if (remove_stable_node(dup))
 			return true;
+		cond_resched();
 	}
 	BUG_ON(!hlist_empty(&stable_node->hlist));
 	free_stable_node_chain(stable_node, root);
@@ -934,6 +935,7 @@ static int unmerge_and_remove_all_rmap_items(void)
 		mm = mm_slot->mm;
 		down_read(&mm->mmap_sem);
 		for (vma = mm->mmap; vma; vma = vma->vm_next) {
+			cond_resched();
 			if (ksm_test_exit(mm))
 				break;
 			if (!(vma->vm_flags & VM_MERGEABLE) || !vma->anon_vma)
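
For reference, a rough sketch of how the two loops read with the patch
applied, assembled only from the hunk context above; the enclosing walk
over the stable_node dup chain and the tail of the vma loop body are
elided/paraphrased, not quoted verbatim from ksm.c:

	/* remove_stable_node_chain(): body of the walk over the
	 * stable_node dup chain.  The chain can hold an unlimited
	 * number of dups, so yield the CPU on every iteration. */
	VM_BUG_ON(!is_stable_node_dup(dup));
	if (remove_stable_node(dup))
		return true;
	cond_resched();		/* new reschedule point */

	/* unmerge_and_remove_all_rmap_items(): the vma walk can cover
	 * thousands of vmas that aren't VM_MERGEABLE (max_map_count is
	 * tunable via sysctl), so reschedule before inspecting each one. */
	for (vma = mm->mmap; vma; vma = vma->vm_next) {
		cond_resched();	/* new reschedule point */
		if (ksm_test_exit(mm))
			break;
		if (!(vma->vm_flags & VM_MERGEABLE) || !vma->anon_vma)
			continue;	/* paraphrased: skip non-mergeable vmas */
		/* ... unmerging of the vma's pages elided ... */
	}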