Re: [for-4.16 PATCH v5 0/4] block/dm: allow DM to defer blk_register_queue() until ready

Mike Snitzer <snitzer@xxxxxxxxxx> · Mon, 15 Jan 2018 18:13:32 -0500

On Mon, Jan 15 2018 at  6:10P -0500,
Mike Snitzer <snitzer@xxxxxxxxxx> wrote:

> On Mon, Jan 15 2018 at  5:51pm -0500,
> Bart Van Assche <Bart.VanAssche@xxxxxxx> wrote:
> 
> > On Mon, 2018-01-15 at 17:15 -0500, Mike Snitzer wrote:
> > > sysfs write op calls kernfs_fop_write which takes:
> > > of->mutex then kn->count#213 (no idea what that is)
> > > then q->sysfs_lock (via queue_attr_store)
> > > 
> > > vs 
> > > 
> > > blk_unregister_queue takes:
> > > q->sysfs_lock then
> > > kernfs_mutex (via kernfs_remove)
> > > seems lockdep thinks "kernfs_mutex" is "kn->count#213"?
> > > 
> > > Feels like lockdep code in fs/kernfs/file.c and fs/kernfs/dir.c is
> > > triggering false positives.. because these seem like different kernfs
> > > locks yet they are reported as "kn->count#213".
> > > 
> > > Certainly feeling out of my depth with kernfs's locking though.
> > 
> > Hello Mike,
> > 
> > I don't think that this is a false positive but rather the following traditional
> > pattern of a potential deadlock involving sysfs attributes:
> > * One context obtains a mutex from inside a sysfs attribute method:
> >   queue_attr_store() obtains q->sysfs_lock.
> > * Another context removes a sysfs attribute while holding a mutex:
> >   blk_unregister_queue() removes the queue sysfs attributes while holding
> >   q->sysfs_lock.
> > 
> > This can result in a real deadlock because the code that removes sysfs objects
> > waits until all ongoing attribute callbacks have finished.
> > 
> > Since commit 667257e8b298 ("block: properly protect the 'queue' kobj in
> > blk_unregister_queue") modified blk_unregister_queue() such that q->sysfs_lock
> > is held around the kobject_del(&q->kobj) call I think this is a regression
> > introduced by that commit.
> 
> Sure, of course it is a regression.
> 
> Aside from moving the mutex_unlock(&q->sysfs_lock) above the
> kobject_del(&q->kobj) I don't know how to fix it.
> 
> Though, realistically that'd be an adequate fix (given the way the code
> was before).

Any chance you apply this and re-run your srp_test that triggered the
lockdep splat?

diff --git a/block/blk-sysfs.c b/block/blk-sysfs.c
index 4a6a40ffd78e..c50e08e9bf17 100644
--- a/block/blk-sysfs.c
+++ b/block/blk-sysfs.c
@@ -952,10 +952,10 @@ void blk_unregister_queue(struct gendisk *disk)
 	if (q->request_fn || (q->mq_ops && q->elevator))
 		elv_unregister_queue(q);
 
+	mutex_unlock(&q->sysfs_lock);
+
 	kobject_uevent(&q->kobj, KOBJ_REMOVE);
 	kobject_del(&q->kobj);
 	blk_trace_remove_sysfs(disk_to_dev(disk));
 	kobject_put(&disk_to_dev(disk)->kobj);
-
-	mutex_unlock(&q->sysfs_lock);
 }

--
dm-devel mailing list
dm-devel@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/dm-devel