On 4/21/22 10:34, Ming Lei wrote:
q->debugfs_dir is used by blk-mq debugfs and blktrace. The dentry is
created when adding disk, and removed when releasing request queue.
There is a small window between releasing the disk and releasing the
request queue, and during that period a disk with the same name may be
created and added, so debugfs_create_dir() may complain with "Directory XXXXX
with parent 'block' already present!"
Fix the issue by moving debugfs_create_dir() into blk_alloc_queue():
the dir is named after q->id from the beginning, renamed to the
disk name when adding the disk, and finally renamed back to q->id in
disk_release().
Reported-by: Dan Williams <dan.j.williams@xxxxxxxxx>
Cc: yukuai (C) <yukuai3@xxxxxxxxxx>
Cc: Shin'ichiro Kawasaki <shinichiro.kawasaki@xxxxxxx>
Signed-off-by: Ming Lei <ming.lei@xxxxxxxxxx>
---
block/blk-core.c | 4 ++++
block/blk-sysfs.c | 4 ++--
block/genhd.c | 8 ++++++++
3 files changed, 14 insertions(+), 2 deletions(-)
diff --git a/block/blk-core.c b/block/blk-core.c
index f305cb66c72a..245ec664753d 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -438,6 +438,7 @@ struct request_queue *blk_alloc_queue(int node_id, bool alloc_srcu)
{
struct request_queue *q;
int ret;
+ char q_name[16];
q = kmem_cache_alloc_node(blk_get_queue_kmem_cache(alloc_srcu),
GFP_KERNEL | __GFP_ZERO, node_id);
@@ -495,6 +496,9 @@ struct request_queue *blk_alloc_queue(int node_id, bool alloc_srcu)
blk_set_default_limits(&q->limits);
q->nr_requests = BLKDEV_DEFAULT_RQ;
+ sprintf(q_name, "%d", q->id);
+ q->debugfs_dir = debugfs_create_dir(q_name, blk_debugfs_root);
+
return q;
fail_stats:
diff --git a/block/blk-sysfs.c b/block/blk-sysfs.c
index 88bd41d4cb59..1f986c20a07b 100644
--- a/block/blk-sysfs.c
+++ b/block/blk-sysfs.c
@@ -837,8 +837,8 @@ int blk_register_queue(struct gendisk *disk)
}
mutex_lock(&q->debugfs_mutex);
- q->debugfs_dir = debugfs_create_dir(kobject_name(q->kobj.parent),
- blk_debugfs_root);
+ q->debugfs_dir = debugfs_rename(blk_debugfs_root, q->debugfs_dir,
+ blk_debugfs_root, kobject_name(q->kobj.parent));
mutex_unlock(&q->debugfs_mutex);
if (queue_is_mq(q)) {
diff --git a/block/genhd.c b/block/genhd.c
index 36532b931841..08895f9f7087 100644
--- a/block/genhd.c
+++ b/block/genhd.c
@@ -25,6 +25,7 @@
#include <linux/pm_runtime.h>
#include <linux/badblocks.h>
#include <linux/part_stat.h>
+#include <linux/debugfs.h>
#include "blk-throttle.h"
#include "blk.h"
@@ -1160,6 +1161,7 @@ static void disk_release_mq(struct request_queue *q)
static void disk_release(struct device *dev)
{
struct gendisk *disk = dev_to_disk(dev);
+ char q_name[16];
might_sleep();
WARN_ON_ONCE(disk_live(disk));
@@ -1173,6 +1175,12 @@ static void disk_release(struct device *dev)
kfree(disk->random);
xa_destroy(&disk->part_tbl);
+ mutex_lock(&disk->queue->debugfs_mutex);
+ sprintf(q_name, "%d", disk->queue->id);
+ disk->queue->debugfs_dir = debugfs_rename(blk_debugfs_root,
+ disk->queue->debugfs_dir, blk_debugfs_root, q_name);
+ mutex_unlock(&disk->queue->debugfs_mutex);
+
disk->queue->disk = NULL;
blk_put_queue(disk->queue);
I don't think this is the right approach.
From my POV the underlying reason is an imbalance between
debugfs_create_dir() (which happens in blk_register_queue()) and
debugfs_remove_dir() (which happens in blk_release_queue()).
So there is a small race window between blk_unregister_queue() and
blk_release_queue(), during which the queue might be re-registered and
then traipse over the (still-existent) debugfs directory of the old queue.
So we should rather move the call to debugfs_remove_dir() into
blk_unregister_queue() to make the two symmetric.
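To illustrate the ordering argument, here is a rough userspace sketch.
It is not kernel code: the toy_* helpers and the disk name "nvme0n1"
are made up for the example, and the table of names only stands in for
the debugfs 'block' directory. It replays "register, unregister,
register again with the same name, release" with the teardown done
either at release time or at unregister time.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Toy stand-in for the debugfs 'block' directory: a small name table. */
#define MAX_DIRS 8
#define NAME_LEN 32

struct toy_debugfs {
	char names[MAX_DIRS][NAME_LEN];
	bool used[MAX_DIRS];
};

/* Create a directory; fails if the name is already present or table is full. */
static bool toy_create_dir(struct toy_debugfs *fs, const char *name)
{
	int free_slot = -1;

	for (int i = 0; i < MAX_DIRS; i++) {
		if (fs->used[i] && strcmp(fs->names[i], name) == 0)
			return false;	/* "already present" */
		if (!fs->used[i] && free_slot < 0)
			free_slot = i;
	}
	if (free_slot < 0)
		return false;
	strncpy(fs->names[free_slot], name, NAME_LEN - 1);
	fs->names[free_slot][NAME_LEN - 1] = '\0';
	fs->used[free_slot] = true;
	return true;
}

/* Remove the directory with the given name, if any. */
static void toy_remove_dir(struct toy_debugfs *fs, const char *name)
{
	for (int i = 0; i < MAX_DIRS; i++)
		if (fs->used[i] && strcmp(fs->names[i], name) == 0)
			fs->used[i] = false;
}

/*
 * The old queue is unregistered, then a new disk with the same
 * (hypothetical) name is registered before the old queue is released.
 * remove_on_unregister selects where the directory is torn down.
 * Returns true if the second registration succeeds.
 */
static bool toy_reregister(bool remove_on_unregister)
{
	struct toy_debugfs fs = { 0 };
	bool ok;

	ok = toy_create_dir(&fs, "nvme0n1");	/* register old queue */
	assert(ok);
	if (remove_on_unregister)
		toy_remove_dir(&fs, "nvme0n1");	/* unregister: symmetric */
	ok = toy_create_dir(&fs, "nvme0n1");	/* new disk, same name */
	if (!remove_on_unregister)
		toy_remove_dir(&fs, "nvme0n1");	/* release: too late */
	return ok;
}
```

In this model toy_reregister(false) fails on the second create, which
corresponds to the "already present" complaint, while
toy_reregister(true) succeeds because create and remove are paired at
register/unregister time.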
Basically the patch '[PATCH RESEND] blk-mq: fix possible creation
failure for 'debugfs_dir'' from yukuai ...
Cheers,
Hannes
--
Dr. Hannes Reinecke Kernel Storage Architect
hare@xxxxxxx +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Felix Imendörffer