<torvalds@xxxxxxxxxxxxxxxxxxxx> wrote: > On Tue, Feb 15, 2022 at 10:37 PM Damien Le Moal > <damien.lemoal@xxxxxxxxxxxxxxxxxx> wrote: > > > > On 2/16/22 13:16, Byungchul Park wrote: > > > [ 2.051040 ] =================================================== > > > [ 2.051406 ] DEPT: Circular dependency has been detected. > > > [ 2.051730 ] 5.17.0-rc1-00014-gcf3441bb2012 #2 Tainted: G W > > > [ 2.051991 ] --------------------------------------------------- > > > [ 2.051991 ] summary > > > [ 2.051991 ] --------------------------------------------------- > > > [ 2.051991 ] *** DEADLOCK *** > > > [ 2.051991 ] > > > [ 2.051991 ] context A > > > [ 2.051991 ] [S] (unknown)(&(&ap->eh_wait_q)->dmap:0) > > > [ 2.051991 ] [W] __raw_spin_lock_irq(&host->lock:0) > > > [ 2.051991 ] [E] event(&(&ap->eh_wait_q)->dmap:0) > > > [ 2.051991 ] > > > [ 2.051991 ] context B > > > [ 2.051991 ] [S] __raw_spin_lock_irqsave(&host->lock:0) > > > [ 2.051991 ] [W] wait(&(&ap->eh_wait_q)->dmap:0) > > > [ 2.051991 ] [E] spin_unlock(&host->lock:0) > > > > Sleeping with a spinlock held would be triggering warnings already, so > > these reports seem bogus to me. > > Yeah, Matthew pointed out the same thing for another use-case, where > it looks like DEPT is looking at the state at the wrong point (not at > the scheduling point, but at prepare_to_sleep()). > > This ata_port_wait() is the exact same pattern, ie we have > > spin_lock_irqsave(ap->lock, flags); > > while (ap->pflags & (ATA_PFLAG_EH_PENDING | ATA_PFLAG_EH_IN_PROGRESS)) { > prepare_to_wait(&ap->eh_wait_q, &wait, TASK_UNINTERRUPTIBLE); > spin_unlock_irqrestore(ap->lock, flags); > schedule(); > > and DEPT has incorrectly taken it to mean that 'ap->lock' is held > during the wait, when it is actually released before actually waiting. > > For the spin-locks, this is all very obvious (because they'd have been > caught long ago by much simpler debug code), but the same > prepare_to_wait -> wait pattern can most definitely happen with > sleeping locks too, so they are all slightly suspect. > > And yes, the detailed reports are hard to read because the locations > are given as "ata_port_wait_eh+0x52/0xc0". Running them through > scripts/decode_stacktrace.sh to turn them into filename and line > numbers - and also sort out inlining - would help a lot. > > Byungchul, could you fix those two issues? Some of your reports may Of couse, that's what I should do. Thanks for your feedback. > well be entirely valid, but the hard-to-read hex offsets and the > knowledge that at least some of them are confused about how > prepare_to_wait -> wait actually works makes the motivation to look at > the details much less.. > > Linus