> On 18 Jan 2019, at 11:31, Andrea Righi <righi.andrea@xxxxxxxxx> wrote:
> 
> This is a redesign of my old cgroup-io-throttle controller:
> https://lwn.net/Articles/330531/
> 
> I'm resuming this old patch to point out a problem that I think is still
> not solved completely.
> 
> = Problem =
> 
> The io.max controller works really well at limiting synchronous I/O
> (READs), but a lot of I/O requests are initiated outside the context of
> the process that is ultimately responsible for their creation (e.g.,
> WRITEs).
> 
> Throttling at the block layer in some cases is too late and we may end
> up slowing down processes that are not responsible for the I/O that
> is being processed at that level.
> 
> = Proposed solution =
> 
> The main idea of this controller is to split I/O measurement and I/O
> throttling: I/O is measured at the block layer for READs, at page cache
> (dirty pages) for WRITEs, and processes are limited while they're
> generating I/O at the VFS level, based on the measured I/O.

Hi Andrea,
what about the case where two processes are dirtying the same pages?
Which will be charged?

Thanks,
Paolo

> = Example =
> 
> Here's a trivial example: create 2 cgroups, set an io.max limit of
> 10MB/s, run a write-intensive workload on both and after a while, from a
> root cgroup, run "sync".
> 
> # cat /proc/self/cgroup
> 0::/cg1
> # fio --rw=write --bs=1M --size=32M --numjobs=16 --name=seeker --time_based --runtime=30
> 
> # cat /proc/self/cgroup
> 0::/cg2
> # fio --rw=write --bs=1M --size=32M --numjobs=16 --name=seeker --time_based --runtime=30
> 
> - io.max controller:
> 
> # echo "259:0 rbps=10485760 wbps=10485760" > /sys/fs/cgroup/unified/cg1/io.max
> # echo "259:0 rbps=10485760 wbps=10485760" > /sys/fs/cgroup/unified/cg2/io.max
> 
> # cat /proc/self/cgroup
> 0::/
> # time sync
> 
> real	0m51,241s
> user	0m0,000s
> sys	0m0,113s
> 
> Ideally "sync" should complete almost immediately, because the root
> cgroup is unlimited and it's not doing any I/O at all, but instead it's
> blocked for more than 50 sec with io.max, because the writeback is
> throttled to satisfy the io.max limits.
> 
> - fsio controller:
> 
> # echo "259:0 10 10" > /sys/fs/cgroup/unified/cg1/fsio.max_mbs
> # echo "259:0 10 10" > /sys/fs/cgroup/unified/cg2/fsio.max_mbs
> 
> [you can find details about the syntax in the documentation patch]
> 
> # cat /proc/self/cgroup
> 0::/
> # time sync
> 
> real	0m0,146s
> user	0m0,003s
> sys	0m0,001s
> 
> = Questions =
> 
> Q: Do we need another controller?
> A: Probably not. I think it would be better to integrate this policy (or
>    something similar) into the current blkio controller; this is just to
>    highlight the problem and get some ideas on how to address it.
> 
> Q: What about proportional limits / latency?
> A: It should be trivial to add latency-based limits if we integrate this
>    in the current I/O controller. As for proportional limits (weights),
>    they are strictly related to I/O scheduling, and since this controller
>    doesn't touch I/O dispatching policies, it is not trivial to implement
>    proportional limits (bandwidth limiting is definitely more
>    straightforward).
> 
> Q: Applying delays at the VFS layer doesn't prevent I/O spikes during
>    writeback, right?
> A: Correct. The tradeoff here is to tolerate I/O bursts during writeback
>    to avoid priority inversion problems in the system.
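To make the measurement/throttling split described above more concrete,
here is a minimal, self-contained userspace sketch of the idea as I
understand it: a plain token bucket, where the names fsio_bucket,
fsio_charge() and fsio_delay() are made up for illustration and are not
the API of these patches. The point it demonstrates is that charging
only updates counters wherever the I/O is measured, while the resulting
delay is applied separately, in the context of the task that generated
the I/O, so an unrelated task like "sync" never sleeps on someone
else's budget:

#include <stdio.h>
#include <time.h>

struct fsio_bucket {
	double rate;    /* allowed bytes per second */
	double tokens;  /* bytes still available before throttling */
	double t_last;  /* time of last refill, in seconds */
};

static double now_sec(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return ts.tv_sec + ts.tv_nsec / 1e9;
}

/* Measurement side: account bytes where the I/O is generated; never sleeps. */
static void fsio_charge(struct fsio_bucket *b, double bytes)
{
	double t = now_sec();

	b->tokens += (t - b->t_last) * b->rate;  /* refill elapsed budget */
	if (b->tokens > b->rate)                 /* cap burst at ~1s of budget */
		b->tokens = b->rate;
	b->t_last = t;
	b->tokens -= bytes;                      /* may go negative: I/O debt */
}

/* Throttling side: how long the charged task itself should sleep. */
static double fsio_delay(const struct fsio_bucket *b)
{
	return b->tokens < 0 ? -b->tokens / b->rate : 0;
}

int main(void)
{
	struct fsio_bucket b = {
		.rate   = 10 * 1024 * 1024,      /* 10MB/s, as in the example */
		.tokens = 10 * 1024 * 1024,
		.t_last = now_sec(),
	};

	fsio_charge(&b, 32 * 1024 * 1024);       /* dirty 32MB of page cache */
	printf("delay: %.1f sec\n", fsio_delay(&b));  /* ~2.2 sec */
	return 0;
}

With the 10MB/s limit from the example above, dirtying 32MB leaves the
bucket roughly 22MB in debt, i.e. about 2.2 seconds of delay, and that
delay is paid by the dirtier at the VFS level rather than by writeback
(or a root-cgroup "sync") at the block layer.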
> 
> Andrea Righi (3):
>   fsio-throttle: documentation
>   fsio-throttle: controller infrastructure
>   fsio-throttle: instrumentation
> 
>  Documentation/cgroup-v1/fsio-throttle.txt | 142 +++++++++
>  block/blk-core.c                          |  10 +
>  include/linux/cgroup_subsys.h             |   4 +
>  include/linux/fsio-throttle.h             |  43 +++
>  include/linux/writeback.h                 |   7 +-
>  init/Kconfig                              |  11 +
>  kernel/cgroup/Makefile                    |   1 +
>  kernel/cgroup/fsio-throttle.c             | 501 ++++++++++++++++++++++++++++++
>  mm/filemap.c                              |  20 +-
>  mm/page-writeback.c                       |  14 +-
>  10 files changed, 749 insertions(+), 4 deletions(-)
> 