Re: [PATCH V2] xfs: implement cgroup writeback support

Brian Foster <bfoster@xxxxxxxxxx> · Mon, 26 Mar 2018 12:28:31 -0400

On Mon, Mar 26, 2018 at 08:59:04AM +1100, Dave Chinner wrote:
> On Fri, Mar 23, 2018 at 10:24:03PM +0800, 张本龙 wrote:
> > Hi Shaohua and XFS,
> > 
> > May I ask how we are going to handle REQ_META I/O issued from XFS? You
> > mentioned charging it to the root cgroup (also in an earlier email
> > discussion), but the 4.16.0-rc6 code does not seem to handle it
> > separately.
> > 
> > In our case, XFS cgroup writeback control was ported and slightly
> > adapted to 3.10.0, and ignoring xfs log bios caused trouble. Threads
> > from a throttled docker container can call submit_bio in the following
> > path under their own identity. Once that container's blkcg has
> > accumulated a large amount of data (e.g., 20GB), the log bios get
> > blocked.
> 
> And this demonstrates why I originally refused to merge
> this code until regression tests were added to fstests to exercise
> these sorts of issues. This stuff adds new internal filesystem IO
> ordering constraints, so we need tests that exercise it and ensure
> we don't accidentally break it in future.
> 

Hmm, but if the user issues fsync from the throttled cgroup then won't
that throttling occur today, regardless of cgroup aware writeback? My
understanding is that cgawb just accurately accounts writeback I/Os to
the owner of the cached pages. IOW, if the buffered writer and fsync
call are in the same throttled cgroup, then the throttling works just as
it would with cgawb and the writer being in a throttled cgroup.

So ISTM that this is an independent problem. What am I missing?

Shaohua,

Do you have a reference to the older metadata-related patch mentioned in
the commit log that presumably addressed this?

Brian

> 
> > I am not familiar with XFS, but it seems that log bios get partially
> > stuck in throttled cgroups, leaving other innocent groups waiting for
> > their completion. To cope with this, we bypassed REQ_META log bios in
> > blk_throtl_bio().
> 
> Yup, the log is global and should not be throttled. Metadata is less
> obvious as to what the correct thing to do is, because writes are
> always done from internal kernel threads but reads are done from a
> mix of kernel threads, user cgroup contexts and user data IO
> completions. Hence there are situations where metadata reads may
> need to be throttled because they are cgroup context only (e.g.
> filesystem directory traversal) but others where reads should not
> be throttled because they have global context (e.g. inside a
> transaction when other buffers are already locked).
> 
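The policy described above could be sketched roughly as follows. This is a
user-space model only, not kernel code: `io_kind`, `struct io_req`,
`global_context` and `may_throttle()` are hypothetical names invented for
illustration and do not correspond to anything in blk-throttle or XFS.
Metadata writes are left out on purpose, since the thread does not settle
how they should be treated.

```c
#include <stdbool.h>

/* Hypothetical classification of an I/O request (illustration only). */
enum io_kind {
	IO_DATA,	/* user data writeback, charged to the page owner */
	IO_LOG,		/* log I/O: global, shared by every cgroup */
	IO_META_READ,	/* metadata read: depends on the submitter's context */
};

struct io_req {
	enum io_kind kind;
	/*
	 * Assumed flag: true when the submitter has global context, e.g. it
	 * is inside a transaction and already holds other locked buffers.
	 */
	bool global_context;
};

/* Return true if the request may be throttled against its cgroup. */
bool may_throttle(const struct io_req *req)
{
	switch (req->kind) {
	case IO_LOG:
		/* Blocking the log stalls unrelated cgroups too. */
		return false;
	case IO_META_READ:
		/* Cgroup-only reads (e.g. directory traversal) may wait. */
		return !req->global_context;
	case IO_DATA:
	default:
		return true;
	}
}
```

A regression test for this area would presumably need to exercise each of
these cases, in particular a metadata read issued while other buffers are
already locked.
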
> Getting this right and keeping it working requires regression tests
> that get run on every release, whether it be upstream or distro
> kernels, and that means we need tests in fstests to cover cgroup IO
> control....
> 
> Cheers,
> 
> Dave.
> -- 
> Dave Chinner
> david@xxxxxxxxxxxxx
> --
> To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at  http://vger.kernel.org/majordomo-info.html