Quoting Daniel Vetter (2019-05-09 13:09:03)
> console_trylock, called from within printk, can be called from pretty
> much anywhere. Including try_to_wake_up. Note that this isn't common,
> usually the box is in pretty bad shape at that point already. But it
> really doesn't help when then lockdep jumps in and spams the logs,
> potentially obscuring the real backtrace we're really interested in.
> One case I've seen (slightly simplified backtrace):
>
>  Call Trace:
>   <IRQ>
>   console_trylock+0xe/0x60
>   vprintk_emit+0xf1/0x320
>   printk+0x4d/0x69
>   __warn_printk+0x46/0x90
>   native_smp_send_reschedule+0x2f/0x40
>   check_preempt_curr+0x81/0xa0
>   ttwu_do_wakeup+0x14/0x220
>   try_to_wake_up+0x218/0x5f0
>   pollwake+0x6f/0x90
>   credit_entropy_bits+0x204/0x310
>   add_interrupt_randomness+0x18f/0x210
>   handle_irq+0x67/0x160
>   do_IRQ+0x5e/0x130
>   common_interrupt+0xf/0xf
>   </IRQ>
>
> This alone isn't a problem, but the spinlock in the semaphore is also
> still held while waking up waiters (up() -> __up() -> try_to_wake_up()
> callchain), which then closes the runqueue vs. semaphore.lock loop,
> and upsets lockdep, which issues a circular locking splat to dmesg.
> Worse it upsets developers, since we don't want to spam dmesg with
> clutter when the machine is dying already.
>
> Fix this by creating a printk_safe_up() which calls wake_up_process
> outside of the spinlock. This isn't correct in full generality, but
> good enough for console_lock:
>
> - console_lock doesn't use interruptible or killable or timeout down()
>   calls, hence an up() is the only thing that can wake up a process.
>   Hence the process can't get woken and killed and reaped while we try
>   to wake it up too.
>
> - semaphore.c always updates the waiter list while under the spinlock,
>   so there's no other races. Specifically another process that races
>   with a quick console_lock/unlock while we've dropped the spinlock
>   already won't see our own waiter.
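For anyone following along, the shape described here can be sketched in plain userspace C. This is only a single-threaded toy model under loud assumptions, not the patch itself: toy_sem, toy_waiter, toy_wake() and the other toy_* names are made up for illustration, the "lock" is just a flag, and toy_wake() merely stands in for wake_up_process(). It shows the waiter being unlinked while the lock is held and the wakeup being issued only after the lock has been dropped.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; these are not kernel types. */
struct toy_waiter {
    struct toy_waiter *next;
    bool up;               /* set by the waker while the lock is held */
    bool woken;            /* records that toy_wake() ran for this waiter */
    bool woken_under_lock; /* records whether the lock was held at that time */
};

struct toy_sem {
    bool locked;             /* models the semaphore's internal spinlock */
    unsigned int count;
    struct toy_waiter *head; /* FIFO list of waiters */
    struct toy_waiter *tail;
};

/* A non-recursive lock: taking it twice would deadlock, so assert instead. */
static void toy_lock(struct toy_sem *s)   { assert(!s->locked); s->locked = true; }
static void toy_unlock(struct toy_sem *s) { assert(s->locked);  s->locked = false; }

/* Models the wakeup step, i.e. the call that may end up back in printk. */
static void toy_wake(struct toy_sem *s, struct toy_waiter *w)
{
    w->woken = true;
    w->woken_under_lock = s->locked;
}

/* Append a waiter; the list is only ever modified with the lock held. */
static void toy_enqueue(struct toy_sem *s, struct toy_waiter *w)
{
    toy_lock(s);
    w->next = NULL;
    w->up = w->woken = w->woken_under_lock = false;
    if (s->tail)
        s->tail->next = w;
    else
        s->head = w;
    s->tail = w;
    toy_unlock(s);
}

/* Unlink the first waiter; the caller must hold the lock. */
static struct toy_waiter *toy_dequeue(struct toy_sem *s)
{
    struct toy_waiter *w = s->head;

    s->head = w->next;
    if (!s->head)
        s->tail = NULL;
    w->next = NULL;
    return w;
}

/* Conventional shape: the wakeup happens while the lock is still held. */
static void toy_up(struct toy_sem *s)
{
    toy_lock(s);
    if (!s->head) {
        s->count++;
    } else {
        struct toy_waiter *w = toy_dequeue(s);

        w->up = true;
        toy_wake(s, w);
    }
    toy_unlock(s);
}

/* Shape discussed above: unlink under the lock, wake after dropping it. */
static void toy_up_wake_unlocked(struct toy_sem *s)
{
    struct toy_waiter *w = NULL;

    toy_lock(s);
    if (!s->head) {
        s->count++;
    } else {
        w = toy_dequeue(s);
        w->up = true;
    }
    toy_unlock(s);

    if (w)
        toy_wake(s, w);
}
```

In the toy, toy_up() records that the wakeup ran with the lock held, while toy_up_wake_unlocked() records that it ran after the unlock. The actual safety argument (nothing else can wake the waiter between the unlock and the wakeup) is the one made in the two bullets above and is not modelled here.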
>
> Note that we only have to break the recursion for the semaphore.lock
> spinlock of the console_lock. Recursion within various scheduler
> related locks is already prevented by the printk_safe_enter/exit pair
> in __up_console_sem().
>
> Also cc'ing John Ogness since perhaps his printk rework fixes this all
> properly.
>
> v2: Ditch attempt to fix console_trylock.
>
> v3: Add a comment explaining why the task we're waking won't
> disappear (Chris), and improve commit message to address review
> questions.
>
> Signed-off-by: Daniel Vetter <daniel.vetter@xxxxxxxxx>
> Cc: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
> Cc: Ingo Molnar <mingo@xxxxxxxxxx>
> Cc: Will Deacon <will.deacon@xxxxxxx>
> Cc: Petr Mladek <pmladek@xxxxxxxx>
> Cc: Sergey Senozhatsky <sergey.senozhatsky@xxxxxxxxx>
> Cc: Steven Rostedt <rostedt@xxxxxxxxxxx>
> Cc: Daniel Vetter <daniel.vetter@xxxxxxxx>
> Cc: John Ogness <john.ogness@xxxxxxxxxxxxx>
> Cc: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx>
> Cc: linux-kernel@xxxxxxxxxxxxxxx
> Signed-off-by: Daniel Vetter <daniel.vetter@xxxxxxxx>

I'm a bit nervous that this is only safe under precisely controlled
conditions, but then again the fact that it is called printk_safe should
deter any other users. The logic checks out, and you convinced me that
the dereference is protected, so

Reviewed-by: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx>
-Chris