On Thu 2019-05-02 16:16:43, Daniel Vetter wrote:
> console_trylock, called from within printk, can be called from pretty
> much anywhere. Including try_to_wake_up. Note that this isn't common,
> usually the box is in pretty bad shape at that point already. But it
> really doesn't help when then lockdep jumps in and spams the logs,
> potentially obscuring the real backtrace we're really interested in.
> One case I've seen (slightly simplified backtrace):
>
> Call Trace:
>   <IRQ>
>   console_trylock+0xe/0x60
>   vprintk_emit+0xf1/0x320
>   printk+0x4d/0x69
>   __warn_printk+0x46/0x90
>   native_smp_send_reschedule+0x2f/0x40
>   check_preempt_curr+0x81/0xa0
>   ttwu_do_wakeup+0x14/0x220
>   try_to_wake_up+0x218/0x5f0
>   pollwake+0x6f/0x90
>   credit_entropy_bits+0x204/0x310
>   add_interrupt_randomness+0x18f/0x210
>   handle_irq+0x67/0x160
>   do_IRQ+0x5e/0x130
>   common_interrupt+0xf/0xf
>   </IRQ>
>
> This alone isn't a problem, but the spinlock in the semaphore is also
> still held while waking up waiters (up() -> __up() -> try_to_wake_up()
> callchain), which then closes the runqueue vs. semaphore.lock loop,
> and upsets lockdep, which issues a circular locking splat to dmesg.
> Worse it upsets developers, since we don't want to spam dmesg with
> clutter when the machine is dying already.
>
> Fix this by creating a __down_trylock which only trylocks the
> semaphore.lock. This isn't correct in full generality, but good enough
> for console_lock:
>
> - there's only ever one console_lock holder, we won't fail spuriously
>   because someone is doing a down() or up() while there's still room
>   (unlike other semaphores with count > 1).
>
> - console_unlock() has one massive retry loop, which will catch anyone
>   who races the trylock against the up(). This makes sure that no
>   printk lines will get lost. Making the trylock more racy therefore
>   has no further impact.

To be honest, I do not see how this could solve the problem.

The circular dependency is still there.
If the new __down_trylock() succeeds then console_unlock() will get
called in the same context and it will still need to call
up() -> try_to_wake_up().

Note that there are many other console_lock() callers that might
happen in parallel and might appear in the wait queue.

Best Regards,
Petr
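
To illustrate the concern, here is a minimal userspace sketch, not kernel
code. Every model_* name is hypothetical, the spinlock is approximated with
a C11 atomic flag, and try_to_wake_up() is reduced to a counter. It assumes
that up() keeps its current shape (take semaphore.lock, then either bump
the count or wake the first waiter) and that only the trylock side changes.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace model; none of these names exist in the kernel. */
struct model_sem {
	atomic_bool lock;            /* stands in for semaphore.lock */
	unsigned int count;          /* 1 = free, 0 = held */
	unsigned int waiters;        /* tasks queued by a blocking down() */
	unsigned int wakeups_locked; /* wakeups issued while lock was held */
};

static void model_sem_init(struct model_sem *s, unsigned int count)
{
	atomic_init(&s->lock, false);
	s->count = count;
	s->waiters = 0;
	s->wakeups_locked = 0;
}

/* Stand-in for try_to_wake_up(): notes whether the caller holds the lock. */
static void model_wake_up(struct model_sem *s)
{
	if (atomic_load(&s->lock))
		s->wakeups_locked++;
}

/* Models the proposed __down_trylock(): give up instead of spinning. */
static bool model_down_trylock(struct model_sem *s)
{
	bool got = false;

	if (atomic_exchange(&s->lock, true))
		return false;        /* lock is contended: fail right away */
	if (s->count > 0) {
		s->count--;
		got = true;
	}
	atomic_store(&s->lock, false);
	return got;
}

/* Models up(): untouched by the patch, still wakes a waiter under the lock. */
static void model_up(struct model_sem *s)
{
	while (atomic_exchange(&s->lock, true))
		;                    /* spin until the lock is ours */
	if (s->waiters == 0) {
		s->count++;
	} else {
		assert(s->count == 0); /* waiters only queue while it is held */
		s->waiters--;        /* ownership passes to the woken waiter */
		model_wake_up(s);
	}
	atomic_store(&s->lock, false);
}
```

In this model, a successful model_down_trylock() followed by model_up()
with one queued waiter still increments wakeups_locked. The wakeup under
the lock comes from the release path, which the trylock change does not
touch.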