On 12/12/22 17:24, Tejun Heo wrote:
On Sun, Dec 11, 2022 at 05:20:58PM -0500, Waiman Long wrote:
As noted by Michal, the blkg_iostat_set's in the lockless list
hold references to blkgs to protect against their removal. Those
blkgs in turn hold a reference to the blkcg. When a cgroup is being destroyed,
cgroup_rstat_flush() is only called at css_release_work_fn() which is
called when the blkcg reference count reaches 0. This circular dependency
will prevent blkcg from being freed until some other events cause
cgroup_rstat_flush() to be called to flush out the pending blkcg stats.
To prevent this delayed blkcg removal, add a new cgroup_rstat_css_cpu_flush()
function to flush stats for a given css and cpu and call it in the blkgs
destruction path, blkcg_destroy_blkgs(), whenever there are still some
pending stats to be flushed. This will ensure that blkcg reference
count can reach 0 ASAP.
Signed-off-by: Waiman Long <longman@xxxxxxxxxx>
Acked-by: Tejun Heo <tj@xxxxxxxxxx>
But a nit below
+	/*
+	 * Flush all the non-empty percpu lockless lists.
+	 */
Can you please explain the deadlock that's being avoided in the above
comment? i.e., it should say why this flush is necessary.
Sure. I will expand the comment to elaborate a bit more.
Cheers,
Longman
+	for_each_possible_cpu(cpu) {
+		struct llist_head *lhead = per_cpu_ptr(blkcg->lhead, cpu);
+
+		if (!llist_empty(lhead))
+			cgroup_rstat_css_cpu_flush(&blkcg->css, cpu);
+	}
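
For readers following the thread, the reference cycle described in the
commit message can be modelled in plain userspace C. This is only a toy
sketch under stated assumptions: every toy_* name below is hypothetical,
plain integers stand in for the kernel's refcounting, and a single flag
stands in for an entry on the percpu lockless list. It is not the
kernel code.

```c
#include <stdbool.h>

/* Toy stand-ins for illustration only; not the kernel structures. */
struct toy_blkcg {
	int refcnt;		/* references held on the blkcg */
	bool released;		/* set once refcnt drops to 0 */
};

struct toy_blkg {
	int refcnt;			/* references held on the blkg */
	struct toy_blkcg *blkcg;	/* each blkg pins its blkcg */
	bool pending_stat;		/* models an entry on the lockless list */
};

static void toy_blkcg_put(struct toy_blkcg *blkcg)
{
	/* In this model the release-time flush could only happen here. */
	if (--blkcg->refcnt == 0)
		blkcg->released = true;
}

static void toy_blkg_put(struct toy_blkg *blkg)
{
	/* Freeing the last blkg reference drops its pin on the blkcg. */
	if (--blkg->refcnt == 0)
		toy_blkcg_put(blkg->blkcg);
}

/* Queue a stat update: the list entry takes a reference on the blkg. */
static void toy_stat_queue(struct toy_blkg *blkg)
{
	if (!blkg->pending_stat) {
		blkg->pending_stat = true;
		blkg->refcnt++;
	}
}

/* Flush pending stats: the list entry's blkg reference is dropped. */
static void toy_flush(struct toy_blkg *blkg)
{
	if (blkg->pending_stat) {
		blkg->pending_stat = false;
		toy_blkg_put(blkg);
	}
}

/*
 * Tear down one blkcg with one blkg that has a pending stat update.
 * Returns true if the blkcg reference count reached 0.
 */
bool toy_destroy(bool flush_at_destroy)
{
	struct toy_blkcg blkcg = { .refcnt = 1, .released = false };
	struct toy_blkg blkg = { .refcnt = 1, .blkcg = &blkcg,
				 .pending_stat = false };

	blkcg.refcnt++;			/* blkg's reference on the blkcg */
	toy_stat_queue(&blkg);		/* pending stats pin the blkg */

	if (flush_at_destroy)
		toy_flush(&blkg);	/* the step modelled after the patch */

	toy_blkg_put(&blkg);		/* destruction drops the base blkg ref */
	toy_blkcg_put(&blkcg);		/* and the base blkcg ref */

	return blkcg.released;
}
```

In this model, toy_destroy(false) leaves the blkcg pinned by the blkg,
which is itself pinned by the unflushed stat entry, while
toy_destroy(true) lets the count reach 0 during teardown.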