On 5/23/23 22:37, Ming Lei wrote:
Hi Yosry,
On Tue, May 23, 2023 at 07:06:38PM -0700, Yosry Ahmed wrote:
Hi Ming,
On Tue, May 23, 2023 at 6:21 PM Ming Lei <ming.lei@xxxxxxxxxx> wrote:
As noted by Michal, the blkg_iostat_set's in the lockless list
hold reference to blkg's to protect against their removal. Those
blkg's hold reference to blkcg. When a cgroup is being destroyed,
cgroup_rstat_flush() is only called at css_release_work_fn() which
is called when the blkcg reference count reaches 0. This circular
dependency will prevent blkcg and some blkgs from being freed after
they are made offline.
I am not at all familiar with blkcg, but does calling
cgroup_rstat_flush() in offline_css() fix the problem?
Except for offline, this list needs to be flushed after the associated disk
is deleted.
or can items be
added to the lockless list(s) after the blkcg is offlined?
Yeah.
percpu_ref_*get(&blkg->refcnt) still can succeed after the percpu refcnt
is killed in blkg_destroy() which is called from both offline css and
removing disk.
As suggested by Tejun, we can use percpu_ref_tryget(&blkg->refcnt) to
make sure that we can only take a reference when the blkg is online. I
think it is a bit safer to take a percpu refcnt to avoid use after free.
My other concern about your patch is that the per cpu list iterations
will be done multiple times when a blkcg is destroyed if many blkgs are
attached to the blkcg. I still prefer to do it once in
blkcg_destroy_blkgs(). I am going to post an updated version tomorrow
after some more testings.
Cheers,
Longman