On Tue, Jan 16, 2018 at 7:13 AM, Mike Snitzer <snitzer@xxxxxxxxxx> wrote: > On Mon, Jan 15 2018 at 6:10P -0500, > Mike Snitzer <snitzer@xxxxxxxxxx> wrote: > >> On Mon, Jan 15 2018 at 5:51pm -0500, >> Bart Van Assche <Bart.VanAssche@xxxxxxx> wrote: >> >> > On Mon, 2018-01-15 at 17:15 -0500, Mike Snitzer wrote: >> > > sysfs write op calls kernfs_fop_write which takes: >> > > of->mutex then kn->count#213 (no idea what that is) >> > > then q->sysfs_lock (via queue_attr_store) >> > > >> > > vs >> > > >> > > blk_unregister_queue takes: >> > > q->sysfs_lock then >> > > kernfs_mutex (via kernfs_remove) >> > > seems lockdep thinks "kernfs_mutex" is "kn->count#213"? >> > > >> > > Feels like lockdep code in fs/kernfs/file.c and fs/kernfs/dir.c is >> > > triggering false positives.. because these seem like different kernfs >> > > locks yet they are reported as "kn->count#213". >> > > >> > > Certainly feeling out of my depth with kernfs's locking though. >> > >> > Hello Mike, >> > >> > I don't think that this is a false positive but rather the following traditional >> > pattern of a potential deadlock involving sysfs attributes: >> > * One context obtains a mutex from inside a sysfs attribute method: >> > queue_attr_store() obtains q->sysfs_lock. >> > * Another context removes a sysfs attribute while holding a mutex: >> > blk_unregister_queue() removes the queue sysfs attributes while holding >> > q->sysfs_lock. >> > >> > This can result in a real deadlock because the code that removes sysfs objects >> > waits until all ongoing attribute callbacks have finished. >> > >> > Since commit 667257e8b298 ("block: properly protect the 'queue' kobj in >> > blk_unregister_queue") modified blk_unregister_queue() such that q->sysfs_lock >> > is held around the kobject_del(&q->kobj) call I think this is a regression >> > introduced by that commit. >> >> Sure, of course it is a regression. >> >> Aside from moving the mutex_unlock(&q->sysfs_lock) above the >> kobject_del(&q->kobj) I don't know how to fix it. >> >> Though, realistically that'd be an adequate fix (given the way the code >> was before). > > Any chance you apply this and re-run your srp_test that triggered the > lockdep splat? > > diff --git a/block/blk-sysfs.c b/block/blk-sysfs.c > index 4a6a40ffd78e..c50e08e9bf17 100644 > --- a/block/blk-sysfs.c > +++ b/block/blk-sysfs.c > @@ -952,10 +952,10 @@ void blk_unregister_queue(struct gendisk *disk) > if (q->request_fn || (q->mq_ops && q->elevator)) > elv_unregister_queue(q); > > + mutex_unlock(&q->sysfs_lock); > + > kobject_uevent(&q->kobj, KOBJ_REMOVE); > kobject_del(&q->kobj); > blk_trace_remove_sysfs(disk_to_dev(disk)); > kobject_put(&disk_to_dev(disk)->kobj); > - > - mutex_unlock(&q->sysfs_lock); > } If this patch is required, similar change should be needed in failure path of blk_register_queue() too. -- Ming Lei