在 2022/04/20 9:46, Williams, Dan J 写道:
On Fri, 2022-03-18 at 21:01 +0800, Ming Lei wrote:
In the whole lifetime of blkcg_gq instance, ->q will be referred, such
as, ->pd_free_fn() is called in blkg_free, and throtl_pd_free() still
may touch the request queue via &tg->service_queue.pending_timer which
is handled by throtl_pending_timer_fn(), so it is reasonable to grab
request queue's refcnt by blkcg_gq instance.
Previously blkcg_exit_queue() is called from blk_release_queue, and it
is hard to avoid the use-after-free. But recently commit 1059699f87eb ("block:
move blkcg initialization/destroy into disk allocation/release handler")
is merged to for-5.18/block, it becomes simple to fix the issue by simply
grabbing request queue's refcnt.
This patch, upstream as commit 0a9a25ca7843 ("block: let blkcg_gq grab
request queue's refcnt") causes the nvdimm unit tests to spam messages
like:
[ 51.439133] debugfs: Directory 'pmem2' with parent 'block' already present!
[ 52.095679] debugfs: Directory 'pmem3' with parent 'block' already present!
[ 52.505613] block device autoloading is deprecated and will be removed.
[ 52.791693] debugfs: Directory 'pmem2' with parent 'block' already present!
[ 53.240314] debugfs: Directory 'pmem3' with parent 'block' already present!
[ 53.373472] debugfs: Directory 'pmem3' with parent 'block' already present!
[ 53.688876] nd_pmem btt2.0: No existing arenas
[ 53.773097] debugfs: Directory 'pmem2s' with parent 'block' already present!
[ 53.884493] debugfs: Directory 'pmem2s' with parent 'block' already present!
[ 54.042946] debugfs: Directory 'pmem2s' with parent 'block' already present!
[ 54.195954] debugfs: Directory 'pmem2s' with parent 'block' already present!
[ 54.383989] debugfs: Directory 'pmem2' with parent 'block' already present!
[ 54.577206] nd_pmem btt3.0: No existing arenas
Hi,
I saw the same warning in our test, and I posted a patch to fix the
problem:
https://patchwork.kernel.org/project/linux-block/patch/20220415035607.1836495-1-yukuai3@xxxxxxxxxx/
The root cause is not relate to the above commit, see details in
the commit message.
Thanks,
Kuai
...on a test doing block device setup/teardown.
It is still reproducible in v5.18-rc3 and I bisected manually applying
commit d578c770c852 ("block: avoid calling blkg_free() in atomic
context") to avoid the other identified regression with this change.
I'll take a deeper look tomorrow. It could be this is triggering a
latent bug in the pmem driver, but wanted to send this in case someone
saw an immediate ordering problem caused by this change besides the one
fixed by d578c770c852.
Reported-by: Christoph Hellwig <hch@xxxxxx>
Signed-off-by: Ming Lei <ming.lei@xxxxxxxxxx>
---
block/blk-cgroup.c | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/block/blk-cgroup.c b/block/blk-cgroup.c
index fa063c6c0338..d53b0d69dd73 100644
--- a/block/blk-cgroup.c
+++ b/block/blk-cgroup.c
@@ -82,6 +82,8 @@ static void blkg_free(struct blkcg_gq *blkg)
if (blkg->pd[i])
blkcg_policy[i]->pd_free_fn(blkg->pd[i]);
+ if (blkg->q)
+ blk_put_queue(blkg->q);
free_percpu(blkg->iostat_cpu);
percpu_ref_exit(&blkg->refcnt);
kfree(blkg);
@@ -167,6 +169,9 @@ static struct blkcg_gq *blkg_alloc(struct blkcg *blkcg, struct request_queue *q,
if (!blkg->iostat_cpu)
goto err_free;
+ if (!blk_get_queue(q))
+ goto err_free;
+
blkg->q = q;
INIT_LIST_HEAD(&blkg->q_node);
spin_lock_init(&blkg->async_bio_lock);