On Sat, Jun 10, 2023 at 07:42:49AM +0800, Ming Lei wrote: > As noted by Michal, the blkg_iostat_set's in the lockless list hold > reference to blkg's to protect against their removal. Those blkg's > hold reference to blkcg. When a cgroup is being destroyed, > cgroup_rstat_flush() is only called at css_release_work_fn() which > is called when the blkcg reference count reaches 0. This circular > dependency will prevent blkcg and some blkgs from being freed after > they are made offline. > > It is less a problem if the cgroup to be destroyed also has other > controllers like memory that will call cgroup_rstat_flush() which will > clean up the reference count. If block is the only controller that uses > rstat, these offline blkcg and blkgs may never be freed leaking more > and more memory over time. > > To prevent this potential memory leak: > > - flush blkcg per-cpu stats list in __blkg_release(), when no new stat > can be added > > - add global blkg_stat_lock for covering concurrent parent blkg stat > update > > - don't grab bio->bi_blkg reference when adding the stats into blkcg's > per-cpu stat list since all stats are guaranteed to be consumed before > releasing blkg instance, and grabbing blkg reference for stats was the > most fragile part of original patch > > Based on Waiman's patch: > > https://lore.kernel.org/linux-block/20221215033132.230023-3-longman@xxxxxxxxxx/ > > Fixes: 3b8cc6298724 ("blk-cgroup: Optimize blkcg_rstat_flush()") > Cc: stable@xxxxxxxxxxxxxxx > Reported-by: Jay Shin <jaeshin@xxxxxxxxxx> > Acked-by: Tejun Heo <tj@xxxxxxxxxx> > Cc: Waiman Long <longman@xxxxxxxxxx> > Cc: mkoutny@xxxxxxxx > Cc: Yosry Ahmed <yosryahmed@xxxxxxxxxx> > Signed-off-by: Ming Lei <ming.lei@xxxxxxxxxx> > --- > V4: > - add ack tag Hi Jens, Waiman agrees with this approach too[1], can you make it in v6.4 if you are fine? [1] https://lore.kernel.org/linux-block/64f20e27-0927-334d-5414-9bb81d639cec@xxxxxxxxxx/ Thanks, Ming