On Wed, Feb 01, 2023 at 12:29:41PM +0800, Ming Lei wrote:
> On Wed, Feb 01, 2023 at 09:56:00AM +0800, Ming Lei wrote:
> > On Tue, Jan 31, 2023 at 09:31:36AM -0800, Bart Van Assche wrote:
> > > On 1/30/23 17:52, Ming Lei wrote:
> > > > Hi Bart,
> > > >
> > > > On Mon, Jan 30, 2023 at 03:22:57PM -0800, Bart Van Assche wrote:
> > > > > Since commit 0a9a25ca7843 ("block: let blkcg_gq grab request queue's
> > > > > refcnt") for many request queues the reference count drops to 1 when
> > > > > the request queue is destroyed instead of to 0. In other words, the
> > > > > request queue is leaked. Fix this by reverting that commit.
> > > >
> > > > When/where you observe that the reference count drops to 1 instead of 0?
> > > >
> > > > Do you have kmem leak log?
> > > >
> > > > Probably, the last drop is in blkg_free_workfn().
> > >
> > > Hi Ming,
> > >
> > > The reference count leak was discovered while I was testing my patch series
> > > that adds support for sub-page limits (https://lore.kernel.org/linux-block/20230130212656.876311-1-bvanassche@xxxxxxx/T/#t).
> > > The second patch in that series adds a counter that tracks the number of
> > > queues that need support for limits below the page size
> > > (sub_page_limit_queues). I noticed that without this patch that counter
> > > increases but never decreases. With this patch applied, that counter drops
> > > back to zero after having run a test that needs support for sub-page limits.
> >
> > I can reproduce the issue by scsi_debug now, but blkg_release() isn't called,
> > so looks like one blkcg_gq lifetime issue since blkcg_exit_disk() is really
> > run.
>
> The problem is caused by 3b8cc6298724 ("blk-cgroup: Optimize blkcg_rstat_flush()").
>
> This commit will hold blkg instance until blkcg_rstat_flush() is called,
> and which may be delayed to css_release_work_fn().
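
To make the lifetime issue described above concrete, here is a small user-space toy model. It is only a sketch and not kernel code: every name in it (toy_blkg, toy_bio_start, toy_rstat_flush, toy_refs_after_teardown) is hypothetical, and it assumes that the bio-start path takes one extra reference when it queues a blkg for a later stats flush, which the flush then drops. The skip_root flag models a check that returns early for a blkg owned by the root cgroup.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a blkg; only the fields this sketch needs (hypothetical). */
struct toy_blkg {
	int refcnt;             /* models the blkg reference count */
	bool is_root;           /* true if owned by the root cgroup */
	bool queued;            /* already on the pending-flush list? */
	struct toy_blkg *next;  /* link in the pending-flush list */
};

/* Models a list of blkgs whose stats are waiting to be flushed. */
static struct toy_blkg *pending;

/* Models the bio-start path: queue the blkg for a later flush and pin it. */
static void toy_bio_start(struct toy_blkg *blkg, bool skip_root)
{
	if (skip_root && blkg->is_root)
		return;         /* root blkg is never queued, so never pinned */
	if (blkg->queued)
		return;         /* already pinned by an earlier bio */
	blkg->queued = true;
	blkg->refcnt++;         /* held until the flush runs */
	blkg->next = pending;
	pending = blkg;
}

/* Models the flush: unqueue every pending blkg and drop its extra reference. */
static void toy_rstat_flush(void)
{
	while (pending) {
		struct toy_blkg *blkg = pending;

		pending = blkg->next;
		blkg->next = NULL;
		blkg->queued = false;
		assert(blkg->refcnt > 0);
		blkg->refcnt--;
	}
}

/*
 * Returns the references still held after the owner drops its initial
 * reference but before any flush has run. Non-zero means the blkg is
 * still pinned at that point.
 */
static int toy_refs_after_teardown(bool is_root, bool skip_root)
{
	struct toy_blkg blkg = { .refcnt = 1, .is_root = is_root };
	int left;

	toy_bio_start(&blkg, skip_root);
	blkg.refcnt--;          /* teardown drops the initial reference */
	left = blkg.refcnt;
	toy_rstat_flush();      /* late flush; also empties the list */
	return left;
}
```

In this model, toy_refs_after_teardown(true, false) leaves one reference outstanding until the flush runs, while toy_refs_after_teardown(true, true) leaves none because the root blkg is never queued. A non-root blkg is still pinned until its flush in either case.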

The following patch can address the blkg leak issue:

diff --git a/block/blk-cgroup.c b/block/blk-cgroup.c
index cb110fc51940..78f855c34746 100644
--- a/block/blk-cgroup.c
+++ b/block/blk-cgroup.c
@@ -2034,6 +2034,10 @@ void blk_cgroup_bio_start(struct bio *bio)
 	struct blkg_iostat_set *bis;
 	unsigned long flags;
 
+	/* Root-level stats are sourced from system-wide IO stats */
+	if (!cgroup_parent(blkcg->css.cgroup))
+		return;
+
 	cpu = get_cpu();
 	bis = per_cpu_ptr(bio->bi_blkg->iostat_cpu, cpu);
 	flags = u64_stats_update_begin_irqsave(&bis->sync);

Thanks,
Ming