On Tue 15-09-20 16:35:45, Linus Torvalds wrote: > On Tue, Sep 15, 2020 at 12:56 PM Matthieu Baerts > <matthieu.baerts@xxxxxxxxxxxx> wrote: > > > > I am sorry, I am not sure how to verify this. I guess it was one > > processor because I removed "-smp 2" option from qemu. So I guess it > > switched to a uniprocessor mode. > > Ok, that all sounds fine. So yes, your problem happens even with just > one CPU, and it's not any subtle SMP race. > > Which is all good - apart from the bug existing in the first place, of > course. It just reinforces the "it's probably a latent deadlock" > thing. So from the traces another theory that appeared to me is that it could be a "missed wakeup" problem. Looking at the code in wait_on_page_bit_common() I found one suspicious thing (which isn't a great match because the problem seems to happen on UP as well and I think it's mostly a theoretical issue but still I'll write it here): wait_on_page_bit_common() has: spin_lock_irq(&q->lock); SetPageWaiters(page); if (!trylock_page_bit_common(page, bit_nr, wait)) - which expands to: ( if (wait->flags & WQ_FLAG_EXCLUSIVE) { if (test_and_set_bit(bit_nr, &page->flags)) return false; } else if (test_bit(bit_nr, &page->flags)) return false; ) __add_wait_queue_entry_tail(q, wait); spin_unlock_irq(&q->lock); Now the suspicious thing is the ordering here. What prevents the compiler (or the CPU for that matter) from reordering SetPageWaiters() call behind the __add_wait_queue_entry_tail() call? I know SetPageWaiters() and test_and_set_bit() operate on the same long but is it really guaranteed something doesn't reorder these? In unlock_page() we have: if (clear_bit_unlock_is_negative_byte(PG_locked, &page->flags)) wake_up_page_bit(page, PG_locked); So if the reordering happens, clear_bit_unlock_is_negative_byte() could return false even though we have a waiter queued. And this seems to be a thing commit 2a9127fcf22 ("mm: rewrite wait_on_page_bit_common() logic") introduced because before we had set_current_state() between SetPageWaiters() and test_bit() which implies a memory barrier. Honza -- Jan Kara <jack@xxxxxxxx> SUSE Labs, CR