On Tue, Sep 24, 2019 at 12:08:04PM -0700, Linus Torvalds wrote:
> On Tue, Sep 24, 2019 at 12:39 AM Dave Chinner <david@xxxxxxxxxxxxx> wrote:
> >
> > Stupid question: how is this any different to simply winding down
> > our dirty writeback and throttling thresholds like so:
> >
> > # echo $((100 * 1000 * 1000)) > /proc/sys/vm/dirty_background_bytes
>
> Our dirty_background stuff is very questionable, but it exists (and
> has those insane defaults) because of various legacy reasons.

That's not what I was asking about. The context is in the previous
lines you didn't quote:

> > > > Is the faster speed reproducible? I don't quite understand why this
> > > > would be.
> > >
> > > Writing to disk simply starts earlier.
> >
> > Stupid question: how is this any different to simply winding down
> > our dirty writeback and throttling thresholds like so:

i.e. I'm asking about the reasons for the performance differential
not asking for an explanation of what writebehind is. If the
performance differential really is caused by writeback starting
sooner, then winding down dirty_background_bytes should produce
exactly the same performance because it will start writeback -much
faster-.

If it doesn't, then the assertion that the difference is caused by
earlier writeout is questionable and the code may not actually be
doing what is claimed....

Basically, I'm asking for proof that the explanation is correct.

> > to start background writeback when there's 100MB of dirty pages in
> > memory, and then:
> >
> > # echo $((200 * 1000 * 1000)) > /proc/sys/vm/dirty_bytes
>
> The thing is, that also accounts for dirty shared mmap pages. And it
> really will kill some benchmarks that people take very very seriously.

Yes, I know that. I'm not suggesting that we do this,

[snip]

> Anyway, the end result of all this is that we have that
> balance_dirty_pages() that is pretty darn complex and I suspect very
> few people understand everything that goes on in that function.

I'd agree with you there - most of the ground work for the
balance_dirty_pages IO throttling feedback loop was all based on
concepts I developed to solve dirty page writeback thrashing
problems on Irix back in 2003. The code we have in Linux was
written by Fengguang Wu with help from a lot of people, but the
underlying concepts of delegating IO to dedicated writeback threads
that calculate and track page cleaning rates (BDI writeback rates)
and then throttling incoming page dirtying rate to the page cleaning
rate all came out of my head....

So, much as it may surprise you, I am one of the few people who do
actually understand how that whole complex mass of accounting and
feedback is supposed to work. :)

> Now, whether write-behind really _does_ help that, or whether it's
> just yet another tweak and complication, I can't actually say.

Neither can I at this point - I lack the data and that's why I was
asking if there was a perf difference with the existing limits wound
right down. Knowing whether the performance difference is simply a
result of starting writeback IO sooner tells me an awful lot about
what other behaviour is happening as a result of the changes in this
patch.

> But I
> don't think 'dirty_background_bytes' is really an argument against
> write-behind, it's just one knob on the very complex dirty handling we
> have.

Never said it was - just trying to determine if a one line
explanation is true or not.

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx
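
For anyone wanting to run the comparison asked for above (the same
workload with the existing limits wound right down), the following is a
minimal sketch, not a tested procedure. It uses the 100MB and 200MB
values from the quoted echo commands. The function names and the VM_DIR
override are hypothetical, introduced only so the helpers can be tried
against a scratch directory; on a real system the files live in
/proc/sys/vm and writing them needs root. The restore step assumes the
usual sysctl behaviour where each *_bytes knob and its *_ratio
counterpart are mutually exclusive (writing one zeroes the other).

```shell
#!/bin/sh
# Hypothetical helper: where the vm sysctl files live. Override VM_DIR to
# point at a scratch directory when experimenting without root.
VM_DIR="${VM_DIR:-/proc/sys/vm}"

# Print the current value of one knob, e.g. "dirty_bytes".
get_knob() {
    cat "$VM_DIR/$1"
}

# Write a new value to one knob.
set_knob() {
    echo "$2" > "$VM_DIR/$1"
}

# Run a command with background writeback starting at 100MB of dirty
# pages and writers throttled at 200MB, then restore the old settings.
with_low_dirty_limits() {
    old_bg_bytes=$(get_knob dirty_background_bytes)
    old_bg_ratio=$(get_knob dirty_background_ratio)
    old_bytes=$(get_knob dirty_bytes)
    old_ratio=$(get_knob dirty_ratio)

    set_knob dirty_background_bytes $((100 * 1000 * 1000))
    set_knob dirty_bytes $((200 * 1000 * 1000))

    # Run the workload under test and remember its exit status.
    "$@"
    status=$?

    # Restore whichever form (bytes or ratio) was in use beforehand.
    if [ "$old_bg_bytes" -ne 0 ]; then
        set_knob dirty_background_bytes "$old_bg_bytes"
    else
        set_knob dirty_background_ratio "$old_bg_ratio"
    fi
    if [ "$old_bytes" -ne 0 ]; then
        set_knob dirty_bytes "$old_bytes"
    else
        set_knob dirty_ratio "$old_ratio"
    fi

    return "$status"
}
```

Usage would be along the lines of "with_low_dirty_limits ./my-write-test",
where ./my-write-test is a placeholder for whatever sequential write
workload showed the speedup. If the patched kernel is still faster than
an unpatched kernel run this way, starting writeback earlier is
unlikely to be the whole explanation.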