On Mon, Jul 20, 2020 at 11:33 PM Michal Hocko <mhocko@xxxxxxxxxx> wrote:
>
> The lockup is in page_unlock in do_read_fault and I suspect that this is
> yet another effect of a very long waitqueue chain which has been
> addressed by 11a19c7b099f ("sched/wait: Introduce wakeup boomark in
> wake_up_page_bit") previously.

Hmm. I do not believe that you can actually get to the point where you
have a million waiters and it takes 20+ seconds to wake everybody up.

More likely, it's actually *caused* by that commit 11a19c7b099f, and
what might be happening is that other CPUs are just adding new
waiters to the list *while* we're waking things up, because somebody
else already got the page lock again.

Humor me.. Does something like this work instead? It's
whitespace-damaged because of just a cut-and-paste, and it's entirely
untested, and I haven't really thought about any memory ordering
issues, but I think it's ok.

The logic is that anybody who called wake_up_page_bit() _must_ have
cleared that bit before that. So if we ever see it set again (and
memory ordering doesn't matter), then clearly somebody else got access
to the page bit (whichever it was), and we should not

 (a) waste time waking up people who can't get the bit anyway

 (b) be in a livelock where other CPUs continually add themselves to
     the wait queue because somebody else got the bit.

and it's that (b) case that I think happens for you.

NOTE! Totally UNTESTED patch follows. I think it's good, but maybe
somebody sees some problem with this approach?

I realize that people can wait for bits other than the lock bit, and
if you're waiting for writeback to complete maybe you don't care if
somebody else then started writeback *AGAIN* on the page and you'd
actually want to be woken up regardless, but honestly, I don't think
it really matters.

                Linus

--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -1054,6 +1054,15 @@ static void wake_up_page_bit(struct page *page, int bit_nr)
 		 * from wait queue
 		 */
 		spin_unlock_irqrestore(&q->lock, flags);
+
+		/*
+		 * If somebody else set the bit again, stop waking
+		 * people up. It's now the responsibility of that
+		 * other page bit owner to do so.
+		 */
+		if (test_bit(bit_nr, &page->flags))
+			return;
+
 		cpu_relax();
 		spin_lock_irqsave(&q->lock, flags);
 		__wake_up_locked_key_bookmark(q, TASK_NORMAL, &key, &bookmark);
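
For anybody who wants to poke at the early-exit logic outside the kernel, here is a rough single-threaded toy model of it. This is only a sketch and not kernel code: every toy_* name, WAKE_BATCH, and the "other CPU" hook are made up for illustration, the wait queue is reduced to a counter, and there is no real locking or memory ordering involved. It only shows the control flow: wake a batch, drop the lock, and stop if the bit has been set again in the meantime.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical batch size; the real wakeup walk has its own limit. */
#define WAKE_BATCH 4

/* Toy stand-in for a page: a flags word plus a count of queued waiters. */
struct toy_page {
	unsigned long flags;	/* bit N set == "bit N is currently held" */
	int waiters;		/* tasks sitting on the (imaginary) wait queue */
};

static bool toy_test_bit(int nr, const unsigned long *word)
{
	return (*word >> nr) & 1UL;
}

/* Hook that plays "another CPU" while the queue lock is dropped. */
typedef void (*toy_other_cpu_fn)(struct toy_page *page, int bit_nr);

/*
 * Wake waiters in batches. Between batches the queue lock would be
 * dropped, so the other CPU gets a chance to run. If the bit is set
 * again at that point, stop: the new owner is responsible for the
 * remaining wakeups. Returns how many waiters this caller woke.
 */
static int toy_wake_up_page_bit(struct toy_page *page, int bit_nr,
				toy_other_cpu_fn other_cpu)
{
	int woken = 0;

	assert(bit_nr >= 0 && bit_nr < (int)(8 * sizeof(page->flags)));

	while (page->waiters > 0) {
		int n = page->waiters < WAKE_BATCH ? page->waiters : WAKE_BATCH;

		page->waiters -= n;
		woken += n;
		if (page->waiters == 0)
			break;

		/* "Lock dropped": let the other CPU do its thing. */
		if (other_cpu)
			other_cpu(page, bit_nr);

		/* The check added by the patch above. */
		if (toy_test_bit(bit_nr, &page->flags))
			return woken;
	}
	return woken;
}

/* Other CPU grabs the bit again and two losers re-queue themselves. */
static void toy_retake_and_requeue(struct toy_page *page, int bit_nr)
{
	page->flags |= 1UL << bit_nr;
	page->waiters += 2;
}

/* Start with 10 waiters and bit 0 clear; report how many we woke. */
static int toy_demo(bool contended)
{
	struct toy_page page = { .flags = 0, .waiters = 10 };

	return toy_wake_up_page_bit(&page, 0,
				    contended ? toy_retake_and_requeue : NULL);
}
```

With nobody interfering, toy_demo(false) wakes all 10 waiters. With the hook re-taking the bit after the first batch, toy_demo(true) gives up after 4 and leaves the rest to the new owner, which is the behavior argued for in (a) and (b) above.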