On Wed, Apr 22, 2020 at 12:27:15AM -0700, Christoph Hellwig wrote: > On Sun, Apr 19, 2020 at 07:45:22PM +0000, Luis Chamberlain wrote: > > +{ > > + struct dentry *dir = NULL; > > + > > + /* This can happen if we have a bug in the lower layers */ > > + dir = debugfs_lookup(kobject_name(q->kobj.parent), blk_debugfs_root); > > + if (dir) { > > + pr_warn("%s: registering request_queue debugfs directory twice is not allowed\n", > > + kobject_name(q->kobj.parent)); > > + dput(dir); > > + return -EALREADY; > > + } > > I don't see why we need this check. If it is valueable enough we > should have a debugfs_create_dir_exclusive or so that retunrns an error > for an exsting directory, instead of reimplementing it in the caller in > a racy way. But I'm not really sure we need it to start with. In short races, and even with synchronous request_queue removal I'm seeing the race is still possible, but that's due to some other races I'm going to chase down now. The easier solution really is to just have a debugfs dir created for each partition if debugfs is enabled, this way the directory will always be there, and the lookups are gone. > > + > > + q->debugfs_dir = debugfs_create_dir(kobject_name(q->kobj.parent), > > + blk_debugfs_root); > > + if (!q->debugfs_dir) > > + return -ENOMEM; > > + > > + return 0; > > +} > > + > > +void blk_queue_debugfs_unregister(struct request_queue *q) > > +{ > > + debugfs_remove_recursive(q->debugfs_dir); > > + q->debugfs_dir = NULL; > > +} > > Which to me suggests we can just fold these two into the callers, > with an IS_ENABLED for the creation case given that we check for errors > and the stub will always return an error. Sorry not sure I follow this. > > debugfs_create_files(q->debugfs_dir, q, blk_mq_debugfs_queue_attrs); > > > > /* > > @@ -856,9 +853,7 @@ void blk_mq_debugfs_register(struct request_queue *q) > > > > void blk_mq_debugfs_unregister(struct request_queue *q) > > { > > - debugfs_remove_recursive(q->debugfs_dir); > > q->sched_debugfs_dir = NULL; > > - q->debugfs_dir = NULL; > > } > > This function is weird - the sched dir gets removed by the > debugfs_remove_recursive, so just leaving a function that clears > a pointer is rather odd. In fact I don't think we need to clear > either sched_debugfs_dir or debugfs_dir anywhere. Indeed. Will clean it up. > > @@ -975,6 +976,14 @@ int blk_register_queue(struct gendisk *disk) > > goto unlock; > > } > > > > + ret = blk_queue_debugfs_register(q); > > + if (ret) { > > + blk_trace_remove_sysfs(dev); > > + kobject_del(&q->kobj); > > + kobject_put(&dev->kobj); > > + goto unlock; > > + } > > + > > Please use a goto label to consolidate the common cleanup code. Sure. > Also I think these generic debugfs changes probably should be separate > to the blktrace changes. I'll try to do that. > > static struct dentry *blk_trace_debugfs_dir(struct blk_user_trace_setup *buts, > > + struct request_queue *q, > > struct blk_trace *bt) > > { > > struct dentry *dir = NULL; > > > > + /* This can only happen if we have a bug on our lower layers */ > > + if (!q->kobj.parent) { > > + pr_warn("%s: request_queue parent is gone\n", buts->name); > > + return NULL; > > + } > > Why is this not simply a WARN_ON_ONCE()? I'll actually remove it and instead fix the race where it happens. > > + if (blk_trace_target_disk(buts->name, kobject_name(q->kobj.parent))) { > > + if (!q->debugfs_dir) { > > + pr_warn("%s: expected request_queue debugfs_dir is not set\n", > > + buts->name); > > + return NULL; > > + } > > + /* > > + * debugfs_lookup() is used to ensure the directory is not > > + * taken from underneath us. We must dput() it later once > > + * done with it within blktrace. > > + */ > > + dir = debugfs_lookup(buts->name, blk_debugfs_root); > > + if (!dir) { > > + pr_warn("%s: expected request_queue debugfs_dir dentry is gone\n", > > + buts->name); > > + return NULL; > > + } > > + /* > > + * This is a reaffirmation that debugfs_lookup() shall always > > + * return the same dentry if it was already set. > > + */ > > + if (dir != q->debugfs_dir) { > > + dput(dir); > > + pr_warn("%s: expected dentry dir != q->debugfs_dir\n", > > + buts->name); > > + return NULL; > > + } > > + bt->backing_dir = q->debugfs_dir; > > + return bt->backing_dir; > > + } > > Even with the gigantic commit log I don't get the point of this > code. It looks rather sketchy and I can't find a rationale for it. Yeah I think this is going to be much easier on the eyes with the revert to synchronous request_queue removal first. Luis