On 5/25/23 10:11, Michal Koutný wrote:
On Thu, May 25, 2023 at 12:35:18PM +0800, Ming Lei <ming.lei@xxxxxxxxxx> wrote:
It is less a problem if the cgroup to be destroyed also has other
controllers like memory that will call cgroup_rstat_flush() which will
clean up the reference count. If block is the only controller that uses
rstat, these offline blkcg and blkgs may never be freed leaking more
and more memory over time.
On v2, io implies memory too.
Do you observe the leak on the v2 system too?
(Beware that (not only) dirty pages would pin offlined memcg, so the
actual mem_cgroup_css_release and cgroup_rstat_flush would be further
delayed.)
The memory leak isn't observed on cgroup v2 as the cgroup_rstat_flush()
call by the memory controller will help to flush the extra references
holding back blkcgs. It is only happening with a cgroup v1 configuration
where blkcg is standalone in a cgroup.
To prevent this potential memory leak:
- flush blkcg per-cpu stats list in __blkg_release(), when no new stat
can be added
- add global blkg_stat_lock for covering concurrent parent blkg stat
update
It's bit unfortunate yet another lock is added :-/
IIUC, even Waiman's patch (flush in blkcg_destroy_blkcfs) would need
synchronization for different CPU replicas flushes in
blkcg_iostat_update, right?
Right, the goal is to prevent concurrent call of blkcg_iostat_update()
to the same blkg.
Cheers,
Longman