From: Michal Hocko <mhocko@xxxxxxxx>

We have seen a bug report with a huge number of soft lockups during
system boot on a !PREEMPT kernel

NMI watchdog: BUG: soft lockup - CPU#1291 stuck for 22s! [systemd-udevd:43283]
[...]
NIP [c00000000094e66c] _raw_spin_lock_irqsave+0xac/0x100
LR [c00000000094e654] _raw_spin_lock_irqsave+0x94/0x100
Call Trace:
[c00002293d883b30] [c0000000012cdee8] __pmd_index_size+0x0/0x8 (unreliable)
[c00002293d883b70] [c0000000002b7490] wake_up_page_bit+0xc0/0x150
[c00002293d883bf0] [c00000000030ceb8] do_fault+0x448/0x870
[c00002293d883c40] [c000000000310080] __handle_mm_fault+0x880/0x16b0
[c00002293d883d10] [c000000000311018] handle_mm_fault+0x168/0x250
[c00002293d883d50] [c00000000006c488] do_page_fault+0x568/0x8d0
[c00002293d883e30] [c00000000000a534] handle_page_fault+0x18/0x38

on a large ppc machine. The very likely cause is a suboptimal
configuration in which systemd-udev spawns way too many workers to bring
the system up.

The lockup is in unlock_page in do_read_fault and I suspect that this is
yet another effect of a very long waitqueue chain, which has been
addressed by 11a19c7b099f ("sched/wait: Introduce wakeup boomark in
wake_up_page_bit") previously.

That commit primarily aimed at hard lockup prevention but it doesn't
really help the !PREEMPT case, which still has to process all the work
without any rescheduling point. Adding a rescheduling point is however
not really trivial because unlock_page is called from many contexts,
many of which are likely atomic. Introducing page_unlock_sleepable is
certainly an option but it seems hard to maintain and it doesn't really
fix the underlying problem, as the same might happen from other
unlock_page callers.

This patch doesn't address the underlying problem but it reduces the
visible effect. Tell the soft lockup watchdog to shut up when retrying
batches of queued waiters. This will also allow systems configured to
panic on warning to proceed with the boot.

Signed-off-by: Michal Hocko <mhocko@xxxxxxxx>
---
 mm/filemap.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/mm/filemap.c b/mm/filemap.c
index 385759c4ce4b..74681c40a6e5 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -41,6 +41,7 @@
 #include <linux/delayacct.h>
 #include <linux/psi.h>
 #include <linux/ramfs.h>
+#include <linux/nmi.h>
 #include "internal.h"
 
 #define CREATE_TRACE_POINTS
@@ -1055,6 +1056,7 @@ static void wake_up_page_bit(struct page *page, int bit_nr)
 		 */
 		spin_unlock_irqrestore(&q->lock, flags);
 		cpu_relax();
+		touch_softlockup_watchdog();
 		spin_lock_irqsave(&q->lock, flags);
 		__wake_up_locked_key_bookmark(q, TASK_NORMAL, &key, &bookmark);
 	}
-- 
2.27.0
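
For illustration only, and not part of the patch: the retry loop touched
above can be modelled in user space roughly as follows. This is a
minimal sketch under stated assumptions. struct sim_queue,
sim_wake_batch(), sim_wake_all() and WAKE_BATCH are hypothetical names
invented here, the batch size is an assumed value, and the counters only
stand in for the lock drop and the touch_softlockup_watchdog() call.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical user-space model; none of these names are kernel APIs. */
#define WAKE_BATCH 64	/* assumed number of waiters woken per lock hold */

struct sim_queue {
	size_t nr_waiters;		/* waiters still queued */
	size_t lock_drops;		/* lock released between batches */
	size_t watchdog_touches;	/* watchdog timestamp resets */
};

/*
 * Wake at most WAKE_BATCH waiters. A nonzero return models the bookmark
 * still being queued, i.e. more waiters remain after this batch.
 */
static int sim_wake_batch(struct sim_queue *q)
{
	size_t n = q->nr_waiters < WAKE_BATCH ? q->nr_waiters : WAKE_BATCH;

	q->nr_waiters -= n;
	return q->nr_waiters != 0;
}

/* Drain the queue in batches, the way the patched loop is structured. */
void sim_wake_all(struct sim_queue *q)
{
	int more;

	assert(q != NULL);
	more = sim_wake_batch(q);	/* first batch, lock held */

	while (more) {
		q->lock_drops++;	/* models unlock + cpu_relax() */
		q->watchdog_touches++;	/* models touch_softlockup_watchdog() */
		more = sim_wake_batch(q); /* models relock + resume at bookmark */
	}
}
```

With 200 queued waiters this model drops the lock and resets the
watchdog three times before the queue is empty. No rescheduling happens
in between, so the waker still does all of the work on one CPU and only
the warning is suppressed.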