On 2023/02/05 1:27, Alan Stern wrote:
> On Sun, Feb 05, 2023 at 01:12:12AM +0900, Tetsuo Handa wrote:
>> On 2023/02/05 0:34, Alan Stern wrote:
>>>> A few of examples:
>>>>
>>>> https://syzkaller.appspot.com/bug?extid=2d6ac90723742279e101
>>>
>>> It's hard to figure out what's wrong from looking at the syzbot report.
>>> What makes you think it is connected with dev->mutex?
>>>
>>> At first glance, it seems that the ath6kl driver is trying to flush a
>>> workqueue while holding a lock or mutex that is needed by one of the
>>> jobs in the workqueue. That's obviously never going to work, no matter
>>> what sort of lockdep validation gets used.
>>
>> That lock is exactly dev->mutex where lockdep validation is disabled.
>> If lockdep validation on dev->mutex were not disabled, we can catch
>> possibility of deadlock before khungtaskd reports real deadlock as hung.
>>
>> Lockdep validation on dev->mutex being disabled is really annoying, and
>> I want to make lockdep validation on dev->mutex enabled; that is the
>> "drivers/core: Remove lockdep_set_novalidate_class() usage" patch.
>
>> Even if it is always safe to acquire a child device's lock while holding
>> the parent's lock, disabling lockdep checks completely on device's lock is
>> not safe.
>
> I understand the problem you want to solve, and I understand that it
> can be frustrating. However, I do not believe you will be able to
> solve this problem.

That is a declaration that driver developers are allowed to take it for
granted that driver callback functions can behave as if dev->mutex is not
held. Some developers test their changes with lockdep enabled, and believe
that their changes are correct because lockdep did not complain.
https://syzkaller.appspot.com/bug?extid=9ef743bba3a17c756174 is an example.

We should somehow update driver core code to make it possible to keep
lockdep checks enabled on dev->mutex.