On Tue 21-07-20 08:33:33, Linus Torvalds wrote:
> On Mon, Jul 20, 2020 at 11:33 PM Michal Hocko <mhocko@xxxxxxxxxx> wrote:
> >
> > The lockup is in page_unlock in do_read_fault and I suspect that this is
> > yet another effect of a very long waitqueue chain which has been
> > addressed by 11a19c7b099f ("sched/wait: Introduce wakeup boomark in
> > wake_up_page_bit") previously.
>
> Hmm.
>
> I do not believe that you can actually get to the point where you have
> a million waiters and it takes 20+ seconds to wake everybody up.

I was really surprised as well!

> More likely, it's actually *caused* by that commit 11a19c7b099f, and
> what might be happening is that other CPU's are just adding new
> waiters to the list *while* we're waking things up, because somebody
> else already got the page lock again.
>
> Humor me.. Does something like this work instead? It's
> whitespace-damaged because of just a cut-and-paste, but it's entirely
> untested, and I haven't really thought about any memory ordering
> issues, but I think it's ok.
>
> The logic is that anybody who called wake_up_page_bit() _must_ have
> cleared that bit before that. So if we ever see it set again (and
> memory ordering doesn't matter), then clearly somebody else got access
> to the page bit (whichever it was), and we should not
>
>  (a) waste time waking up people who can't get the bit anyway
>
>  (b) be in a livelock where other CPU's continually add themselves to
>      the wait queue because somebody else got the bit.
>
> and it's that (b) case that I think happens for you.
>
> NOTE! Totally UNTESTED patch follows. I think it's good, but maybe
> somebody sees some problem with this approach?

I can ask them to give it a try.

-- 
Michal Hocko
SUSE Labs