On Fri, 3 Nov 2017 07:21:21 -0400 Steven Rostedt <rostedt@xxxxxxxxxxx> wrote: > On Thu, 2 Nov 2017 21:09:32 -0700 > John Hubbard <jhubbard@xxxxxxxxxx> wrote: > > > > For example, if there are 3 or more threads, you can do the following: > > > > thread A: holds the console lock, is printing, then moves into the console_unlock > > phase > > > > thread B: goes into the waiter spin loop above, and (once the polarity is corrected) > > waits for console_waiter to become 0 > > > > thread A: finishing up, sets console_waiter --> 0 > > > > thread C: before thread B notices, thread C goes into the "else" section, sees that > > console_waiter == 0, and sets console_waiter --> 1. So thread C now > > becomes the waiter > > But console_waiter only gets set to 1 if console_waiter is 0 *and* > console_owner is not NULL and is not current. console_owner is only > updated under a spin lock and console_waiter is only set under a spin > lock when console_owner is not NULL. > > This means this scenario can not happen. > > > > > > thread B: gets *very* unlucky and never sees the 1 --> 0 --> 1 transition of > > console_waiter, so it continues waiting. And now we have both B > > and C in the same spin loop, and this is now broken. > > > > At the root, this is really due to the absence of a pre-existing "hand-off this lock" > > mechanism. And this one here is not quite correct. > > > > Solution ideas: for a true hand-off, there needs to be a bit more information > > exchanged. Conceptually, a (lock-protected) list of waiters (which would > > only ever have zero or one entries) is a good way to start thinking about it. > > As stated above, the console owner check will prevent this issue. > I'll condense the patch to show what I mean: To become a waiter, a task must do the following: + printk_safe_enter_irqsave(flags); + + raw_spin_lock(&console_owner_lock); + owner = READ_ONCE(console_owner); + waiter = READ_ONCE(console_waiter); + if (!waiter && owner && owner != current) { + WRITE_ONCE(console_waiter, true); + spin = true; + } + raw_spin_unlock(&console_owner_lock); The new waiter gets set only if there isn't already a waiter *and* there is an owner that is not current (and with the printk_safe_enter I don't think that is even needed). + while (!READ_ONCE(console_waiter)) + cpu_relax(); The spin is outside the spin lock. But only the owner can clear it. Now the owner is doing a loop of this (with interrupts disabled) + raw_spin_lock(&console_owner_lock); + console_owner = current; + raw_spin_unlock(&console_owner_lock); Write to consoles. + raw_spin_lock(&console_owner_lock); + waiter = READ_ONCE(console_waiter); + console_owner = NULL; + raw_spin_unlock(&console_owner_lock); + if (waiter) + break; At this moment console_owner is NULL, and no new waiters can happen. The next owner will be the waiter that is spinning. + if (waiter) { + WRITE_ONCE(console_waiter, false); There is no possibility of another task sneaking in and becoming a waiter at this moment. The console_owner was cleared under spin lock, and a waiter is only set under the same spin lock if owner is set. There will be no new owner sneaking in because to become the owner, you must have the console lock. Since it is never released between the time the owner clears console_waiter and the waiter takes the console lock, there is no race. -- Steve -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@xxxxxxxxx. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@xxxxxxxxx"> email@xxxxxxxxx </a>