On Thu, May 25, 2023 at 04:11:34PM +0200, Michal Koutný wrote: > On Thu, May 25, 2023 at 12:35:18PM +0800, Ming Lei <ming.lei@xxxxxxxxxx> wrote: > > It is less a problem if the cgroup to be destroyed also has other > > controllers like memory that will call cgroup_rstat_flush() which will > > clean up the reference count. If block is the only controller that uses > > rstat, these offline blkcg and blkgs may never be freed leaking more > > and more memory over time. > > On v2, io implies memory too. > Do you observe the leak on the v2 system too? blkg leak is only observed on v1. > > (Beware that (not only) dirty pages would pin offlined memcg, so the > actual mem_cgroup_css_release and cgroup_rstat_flush would be further > delayed.) > > > To prevent this potential memory leak: > > > > - flush blkcg per-cpu stats list in __blkg_release(), when no new stat > > can be added > > > > - add global blkg_stat_lock for covering concurrent parent blkg stat > > update > > It's bit unfortunate yet another lock is added :-/ Cause it should be the simplest one for backport, :-) > > IIUC, even Waiman's patch (flush in blkcg_destroy_blkcfs) would need > synchronization for different CPU replicas flushes in > blkcg_iostat_update, right? > > > - don't grab bio->bi_blkg reference when adding the stats into blkcg's > > per-cpu stat list since all stats are guaranteed to be consumed before > > releasing blkg instance, and grabbing blkg reference for stats was the > > most fragile part of original patch > > > At one moment, the lhead -> blkcg_gq reference seemed alright to me and But bio->bi_blkg has grabbed one reference in blk_cgroup_bio_start already. > consequently blkcg_gq -> blkcg is the one that looks reversed (forming IMO, this one is correct, cause blkcg_gq depends on blkcg. > the cycle). But changing its direction would be much more fundamental > change, it'd need also kind of blkcg_gq reparenting -- similarly to > memcg. Thanks, Ming