On Mon 2020-09-28 12:25:59, Peter Zijlstra wrote:
> On Mon, Sep 28, 2020 at 06:04:23PM +0800, Chengming Zhou wrote:
>
> > Well, you are lucky. So it's a problem in our printk implementation.
>
> Not lucky; I just kicked it in the groin really hard:
>
>   git://git.kernel.org/pub/scm/linux/kernel/git/peterz/queue.git debug/experimental
>
> > The deadlock path is:
> >
> > printk
> > vprintk_emit
> > console_unlock
> > vt_console_print
> > hide_cursor
> > bit_cursor
> > soft_cursor
> > queue_work_on
> > __queue_work
> > try_to_wake_up
> > _raw_spin_lock
> > native_queued_spin_lock_slowpath
> >
> > Looks like it's introduced by this commit:
> >
> > eaa434defaca1781fb2932c685289b610aeb8b4b
> >
> > "drm/fb-helper: Add fb_deferred_io support"
>
> Oh gawd, yeah, all the !serial consoles are utter batshit.
>
> Please look at John's last printk rewrite, IIRC it farms all that off to
> a kernel thread instead of doing it from the printk() caller's context.
>
> I'm not sure where he hides his latests patches, but I'm sure he'll be
> more than happy to tell you.

AFAIK, John is just working on updating the patchset so that it will
be based on the lockless ringbuffer that is finally in the queue
for-5.10.

Upstreaming the console handling will be the next big step. I am sure
that there will be a long discussion about it. But there might be a few
things that would help with removing printk_deferred().

1. Messages will be printed on consoles by dedicated kthreads. That
   will be a safe context. No deadlocks.

2. The registration and unregistration of consoles should no longer
   be handled by console_lock (semaphore). It should be possible to
   call most consoles without a sleeping lock. That should remove all
   these deadlocks between printk() and the scheduler.

   There might be problems with some consoles. For example, tty would
   most likely still need a sleeping lock because it also uses the
   console semaphore internally.

3. We will try harder to get the messages out immediately during
   panic().
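
To illustrate point 1, here is a minimal userspace sketch of the idea,
not kernel code and not John's patchset. pthreads stand in for kthreads,
and a mutex-protected array stands in for the lockless ringbuffer. All
names (log_store, printer_thread, console_write, printer_stop) are
hypothetical. The caller only stores the message and signals the
printer. It never enters the console driver, so the call chain quoted
above cannot start from the caller's context. How the kernel wakes the
printing thread safely from scheduler context is a separate problem
that this sketch does not address.

```c
/* Userspace analogy only; every name here is hypothetical. */
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

#define LOG_SLOTS   64
#define LOG_MSG_LEN 128

struct log_buf {
	pthread_mutex_t lock;   /* stand-in for the lockless ringbuffer */
	pthread_cond_t  more;   /* signalled when a message is stored */
	char msgs[LOG_SLOTS][LOG_MSG_LEN];
	unsigned long head;     /* next slot to write */
	unsigned long tail;     /* next slot to print */
	unsigned long printed;  /* messages emitted by the printer thread */
	unsigned long dropped;  /* messages lost because the buffer was full */
	bool stop;
};

static struct log_buf log_buf = {
	.lock = PTHREAD_MUTEX_INITIALIZER,
	.more = PTHREAD_COND_INITIALIZER,
};

/* Caller side: store the message and return. No console code runs here. */
static void log_store(const char *msg)
{
	assert(msg != NULL);

	pthread_mutex_lock(&log_buf.lock);
	if (log_buf.head - log_buf.tail == LOG_SLOTS) {
		log_buf.dropped++;
	} else {
		char *slot = log_buf.msgs[log_buf.head % LOG_SLOTS];

		snprintf(slot, LOG_MSG_LEN, "%s", msg);
		log_buf.head++;
		pthread_cond_signal(&log_buf.more);
	}
	pthread_mutex_unlock(&log_buf.lock);
}

/* Stand-in for a slow console driver. */
static void console_write(const char *msg)
{
	fputs(msg, stderr);
	fputc('\n', stderr);
}

/* Dedicated printer: the only context that ever calls console_write(). */
static void *printer_thread(void *arg)
{
	char local[LOG_MSG_LEN];

	(void)arg;
	pthread_mutex_lock(&log_buf.lock);
	for (;;) {
		while (log_buf.head == log_buf.tail && !log_buf.stop)
			pthread_cond_wait(&log_buf.more, &log_buf.lock);
		if (log_buf.head == log_buf.tail)
			break;  /* stop requested and buffer drained */

		memcpy(local, log_buf.msgs[log_buf.tail % LOG_SLOTS],
		       LOG_MSG_LEN);
		log_buf.tail++;

		/* Call the console without holding the buffer lock. */
		pthread_mutex_unlock(&log_buf.lock);
		console_write(local);
		pthread_mutex_lock(&log_buf.lock);
		log_buf.printed++;
	}
	pthread_mutex_unlock(&log_buf.lock);
	return NULL;
}

/* Ask the printer to drain the remaining messages and exit. */
static void printer_stop(pthread_t thread)
{
	pthread_mutex_lock(&log_buf.lock);
	log_buf.stop = true;
	pthread_cond_signal(&log_buf.more);
	pthread_mutex_unlock(&log_buf.lock);
	pthread_join(thread, NULL);
}
```

A caller would start printer_thread() once with pthread_create(), call
log_store() from any context, and call printer_stop() at shutdown. The
cost is the one discussed for panic(): the messages only reach the
console if the printer thread gets to run.
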
It will take some time until the above reaches upstream. But it seems
to be the right way to go.

About printk_deferred():

It is a whack-a-mole game. It is easy to miss a printk() that might
eventually cause the deadlock.

The printk deferred context is safer. But it is still a whack-a-mole
game. The kthreads will do the same job for sure.

Finally, the deadlock happens "only" when someone is waiting on
console_lock() in parallel. Otherwise, the waitqueue for the semaphore
is empty and the scheduler is not called.

It means that there is quite a big chance to see the WARN(). It might
be even bigger than with printk_deferred() because a WARN() in the
scheduler means that the scheduler is in big trouble. Nobody guarantees
that the deferred messages will get handled later.

Best Regards,
Petr