Hello, Haifeng.

On Wed, Aug 28, 2024 at 11:32:24AM +0800, Haifeng Xu wrote:
...
> The filesystem is ext4(ordered). The meta data can be written out by
> writeback, but if there are too many dirty pages, we had to do
> checkpoint to write out the meta data in current thread context.
>
> In this case, the blkg of thread1 has set io.max, so the j_checkpoint_mutex
> can't be released and many threads must wait for it. However, the blkg from
> buffer page didn't set any io policy. Therefore, for the meta buffer head,
> we can associate the bio with blkg from the buffer page instead of current
> thread context.
>
> Signed-off-by: Haifeng Xu <haifeng.xu@xxxxxxxxxx>
> ---
>  fs/buffer.c | 11 +++++++++++
>  1 file changed, 11 insertions(+)
>
> diff --git a/fs/buffer.c b/fs/buffer.c
> index e55ad471c530..a7889f258d0d 100644
> --- a/fs/buffer.c
> +++ b/fs/buffer.c
> @@ -2819,6 +2819,17 @@ static void submit_bh_wbc(blk_opf_t opf, struct buffer_head *bh,
>  	if (wbc) {
>  		wbc_init_bio(wbc, bio);
>  		wbc_account_cgroup_owner(wbc, bh->b_page, bh->b_size);
> +	} else if (buffer_meta(bh)) {
> +		struct folio *folio;
> +		struct cgroup_subsys_state *memcg_css, *blkcg_css;
> +
> +		folio = page_folio(bh->b_page);
> +		memcg_css = mem_cgroup_css_from_folio(folio);
> +		if (cgroup_subsys_on_dfl(memory_cgrp_subsys) &&
> +		    cgroup_subsys_on_dfl(io_cgrp_subsys)) {
> +			blkcg_css = cgroup_e_css(memcg_css->cgroup, &io_cgrp_subsys);
> +			bio_associate_blkg_from_css(bio, blkcg_css);

I think the right way to do it is to mark the bio with REQ_META and
implement forced charging in blk-throtl, similar to blk-iocost.

Thanks.

-- 
tejun