On Fri, 18 Jun 2021 10:31:35 +0200 Michael Stapelberg wrote:
>Hey Miklos
>
>Thanks for taking a look!
>
>On Fri, 18 Jun 2021 at 10:18, Miklos Szeredi <miklos@xxxxxxxxxx> wrote:
>>
>> On Thu, 17 Jun 2021 at 11:53, Michael Stapelberg
>> <stapelberg+linux@xxxxxxxxxx> wrote:
>> >
>> > These new knobs allow e.g. FUSE file systems to guide kernel memory
>> > writeback bandwidth throttling.
>> >
>> > Background:
>> >
>> > When using mmap(2) to read/write files, the page-writeback code tries to
>> > measure how quick file system backing devices (BDI) are able to write data,
>> > so that it can throttle processes accordingly.
>> >
>> > Unfortunately, certain usage patterns, such as linkers (tested with GCC,
>> > but also the Go linker) seem to hit an unfortunate corner case when writing
>> > their large executable output files: the kernel only ever measures
>> > the (non-representative) rising slope of the starting bulk write, but the
>> > whole file write is already over before the kernel could possibly measure
>> > the representative steady-state.
>> >
>> > As a consequence, with each program invocation hitting this corner case,
>> > the FUSE write bandwidth steadily sinks in a downward spiral, until it
>> > eventually reaches 0 (!). This results in the kernel heavily throttling
>> > page dirtying in programs trying to write to FUSE, which in turn manifests
>> > itself in slow or even entirely stalled linker processes.
>> >
>> > Change:
>> >
>> > This commit adds 2 knobs which allow avoiding this situation entirely on a
>> > per-file-system basis by restricting the minimum/maximum bandwidth.
>>
>>
>> This looks like a bug in the dirty throttling heuristics, that may be
>> affecting multiple fuse based filesystems.
>>
>> Ideally the solution should be a fix to those heuristics, not adding more knobs.
>
>
>Agreed. +1
>
>>
>>
>> Is there a fundamental reason why that can't be done?
>> Maybe the heuristics need to detect the fact that steady state has not been
>> reached, and not modify the bandwidth in that case, or something along
>> those lines.
>
>Maybe, but I don’t have the expertise, motivation or time to
>investigate this any further, let alone commit to get it done.
>During our previous discussion I got the impression that nobody else
>had any cycles for this either:
>https://lore.kernel.org/linux-fsdevel/CANnVG6n=ySfe1gOr=0ituQidp56idGARDKHzP0hv=ERedeMrMA@xxxxxxxxxxxxxx/
Its timestamp is Mon, 9 Mar 2020 16:11:41 +0100
>
>Have you had a look at the China LSF report at
>http://bardofschool.blogspot.com/2011/?
>The author of the heuristic has spent significant effort and time
>coming up with what we currently have in the kernel:
>
>"""
>Fengguang said he draw more than 10K performance graphs and read even
>more in the past year.
>"""
>
>This implies that making changes to the heuristic will not be a quick fix.
The 2019 attempt [01] IIRC was trying to cut the heuristics.
>
>I think adding these limit knobs could be useful regardless of the
>specific heuristic behavior.
>The knobs are certainly easy to understand, safe to introduce (no regressions),
>and can be used to fix the issue at hand as well as other issues (if
>any, now or in the future).
>
>Thanks
>Best regards
>Michael

[01] https://lore.kernel.org/lkml/20191118082559.GJ6910@shao2-debian/