On 4 Oct 2022 11:17:48 -0400 Waiman Long <longman@xxxxxxxxxx>
> For a system with many CPUs and block devices, the time to do
> blkcg_rstat_flush() from cgroup_rstat_flush() can be rather long. It
> can be especially problematic as interrupts are disabled during the
> flush. It was reported that it might take seconds to complete in some
> extreme cases, leading to hard lockup messages.
>
> As it is likely that not all the percpu blkg_iostat_set's have been
> updated since the last flush, those stale blkg_iostat_set's don't need
> to be flushed in this case. This patch optimizes blkcg_rstat_flush()
> by keeping a lockless list of recently updated blkg_iostat_set's in a
> newly added percpu blkcg->lhead pointer.
>
> The blkg_iostat_set is added to a sentinel lockless list on the update
> side in blk_cgroup_bio_start(). It is removed from the sentinel
> lockless list when flushed in blkcg_rstat_flush(). Due to racing, it
> is possible that blkg_iostat_set's in the lockless list may have no
> new IO stats to be flushed, but that is OK.

So it is likely that another flag, updated when a bis is added to or
deleted from the llist, can cut 1/3 off without making your patch
overly complicated.

> struct blkg_iostat_set {
> 	struct u64_stats_sync		sync;
> +	struct llist_node		lnode;
> +	struct blkcg_gq			*blkg;
+	atomic_t			queued;
> 	struct blkg_iostat		cur;
> 	struct blkg_iostat		last;
> };
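
Concretely, something like the untested sketch below is what I have in
mind. The helper name blkg_queue_stat() and the exact shape of the
flush loop are my own invention; only the llist/atomic primitives and
the percpu blkcg->lhead list come from your patch.

#include <linux/atomic.h>
#include <linux/llist.h>
#include <linux/percpu.h>

/* update side, called from blk_cgroup_bio_start() after the stats
 * are updated; only the 0 -> 1 transition of ->queued wins the right
 * to add the node, so a bis already on the percpu llist is never
 * queued twice */
static void blkg_queue_stat(struct blkcg *blkcg, struct blkg_iostat_set *bis)
{
	struct llist_head *lhead = this_cpu_ptr(blkcg->lhead);

	if (atomic_cmpxchg(&bis->queued, 0, 1) == 0)
		llist_add(&bis->lnode, lhead);
}

/* flush side, inside the per-cpu loop of blkcg_rstat_flush() */
	struct llist_head *lhead = per_cpu_ptr(blkcg->lhead, cpu);
	struct llist_node *pos = llist_del_all(lhead);
	struct blkg_iostat_set *bis, *next;

	llist_for_each_entry_safe(bis, next, pos, lnode) {
		/* clear ->queued (atomic_xchg is a full barrier)
		 * before reading the stats, so a racing update
		 * re-queues the bis for the next flush instead of
		 * getting lost */
		atomic_xchg(&bis->queued, 0);
		/* ... fold bis->cur - bis->last into the parent ... */
	}

The atomic flag keeps each bis on the percpu llist at most once, and
clearing it before the stats are read means a racing update gets
re-queued for the next flush rather than dropped.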