On Fri, Jul 05, 2019 at 11:10:45AM -0400, Brian Foster wrote:
> cc linux-xfs
>
> On Fri, Jul 05, 2019 at 10:33:04PM +0800, Yafang Shao wrote:
> > On Fri, Jul 5, 2019 at 7:10 PM Michal Hocko <mhocko@xxxxxxxxxx> wrote:
> > >
> > > On Fri 05-07-19 17:41:44, Yafang Shao wrote:
> > > > On Fri, Jul 5, 2019 at 5:09 PM Michal Hocko <mhocko@xxxxxxxxxx> wrote:
> > > [...]
> > > > > Why cannot you move over to v2 and have to stick with v1?
> > > > Because the interfaces between cgroup v1 and cgroup v2 are changed too
> > > > much, which is unacceptable by our customer.
> > >
> > > Could you be more specific about obstacles with respect to interfaces
> > > please?
> > >
> >
> > Lots of applications will be changed.
> > Kubernetes, Docker and some other applications which are using cgroup v1,
> > that will be a trouble, because they are not maintained by us.
> >
> > > > It may take long time to use cgroup v2 in production environment, per
> > > > my understanding.
> > > > BTW, the filesystem on our servers is XFS, but the cgroup v2
> > > > writeback throttle is not supported on XFS by now, that is beyond my
> > > > comprehension.
> > >
> > > Are you sure? I would be surprised if v1 throttling would work while v2
> > > wouldn't. As far as I remember it is v2 writeback throttling which
> > > actually works. The only throttling we have for v1 is reclaim based one
> > > which is a huge hammer.
> > > --
> >
> > We did it in cgroup v1 in our kernel.
> > But the upstream still don't support it in cgroup v2.
> > So my real question is why upstream can't support such an important file system ?
> > Do you know which companies besides facebook are using cgroup v2 in
> > their product environment?
> >
>
> I think the original issue with regard to XFS cgroupv2 writeback
> throttling support was that at the time the XFS patch was proposed,
> there wasn't any test coverage to prove that the code worked (and the
> original author never followed up).
> That has since been resolved and
> Christoph has recently posted a new patch [1], which appears to have
> been accepted by the maintainer.

I don't think the validation issue has been resolved.

i.e. we still don't have regression tests that ensure it keeps
working in future, or that it works correctly in any specific
distro setting/configuration. The lack of repeatable QoS validation
infrastructure was the reason I never merged support for this in the
first place.

So while the (simple) patch to support it has been merged now,
there's no guarantee that it will work as expected or continue to do
so over the long run as nobody upstream or in distro land has a way
of validating that it is working correctly.

From that perspective, it is still my opinion that one-off "works
for me" testing isn't sufficient validation for a QoS feature that
people will use to implement SLAs with $$$ penalties attached to
QoS failures....

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx