On 3/24/21 4:19 PM, Andrew Morton wrote:
> On Wed, 24 Mar 2021 16:13:48 -0600 Jens Axboe <axboe@xxxxxxxxx> wrote:
>
>> On 3/24/21 4:02 PM, Andrew Morton wrote:
>>> On Wed, 24 Mar 2021 13:48:08 -0600 Jens Axboe <axboe@xxxxxxxxx> wrote:
>>>
>>>> On Tue, Mar 2, 2021 at 4:43 PM <akpm@xxxxxxxxxxxxxxxxxxxx> wrote:
>>>>>
>>>>> The patch titled
>>>>>      Subject: mm: provide filemap_range_needs_writeback() helper
>>>>> has been added to the -mm tree. Its filename is
>>>>>      mm-provide-filemap_range_needs_writeback-helper.patch
>>>>
>>>> Are you still planning on flushing this out for 5.12?
>>>
>>> Oh. No, I wasn't planning on that. I saw nothing which made me
>>> think it was that urgent.
>>
>> Ehm ok, I think that was the plan when we originally talked about
>> this series. At least that was what was in my original emails on
>> this topic :-)
>>
>>> What is the justification? How much performance benefit are we
>>> talking here?
>>
>> Well, it's pretty huge. If we return "nope, can't do this
>> non-blocking", io_uring has to bump the operation to an async
>> thread, so CPU cost goes way up, and latencies do too.
>>
>> Problem is, we're now at -rc4. My understanding was that we'd get
>> this in for 5.12, and obviously way sooner than this. But I kind of
>> lost track of it once it went into your tree, until I started
>> thinking about it earlier today. While it's not the end of the
>> world to wait for 5.13, the current situation does suck a lot for
>> certain workloads.
>
> I'd be OK with sending it to Linus now, but we really should put some
> numbers behind "suck a lot" to justify it. And a -stable backport
> might be justified if the benefit is large enough, and if "certain
> workloads" are common enough.
>
> But with what we have here, I and everyone else considering these
> patches are going in blind!

The backstory is that an internal workload complained because it was
using too much CPU, and when I took a look, we had a lot of io_uring
workers going to town. For an async buffered-read style workload I
normally expect _zero_ offloads to a worker thread, but this one had
tons of them. I'd drop caches and things would look good again, but a
minute later we'd regress back to using workers. Turns out that every
minute something was reading parts of the device, which would add page
cache for that inode. I put patches like these into our kernel, and
the problem was solved.

Obviously that's not a great story, since there are no hard numbers
there, and I'd have to generate those for you. Which I surely can.
While the original premise of the patches was fixing excessive CPU
usage, the general observation is that beyond burning a lot more CPU,
offloading to a thread also always adds latency.

So let me know if you want me to generate some numbers. They will be
artificial, but at least it'll be something beyond a story.
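For reference, the gist of the helper is something like the sketch
below. This is the idea rather than the queued patch verbatim, so
details may differ from what's sitting in -mm:

/*
 * Sketch: does the byte range contain pages we could block on for a
 * buffered read, i.e. dirty, locked, or under-writeback pages? If
 * not, a NOWAIT read can be served inline from the page cache.
 */
bool filemap_range_needs_writeback(struct address_space *mapping,
                                   loff_t start_byte, loff_t end_byte)
{
        XA_STATE(xas, &mapping->i_pages, start_byte >> PAGE_SHIFT);
        pgoff_t max = end_byte >> PAGE_SHIFT;
        struct page *page;

        /* Cheap out: no dirty or writeback pages in the mapping at all */
        if (!mapping_tagged(mapping, PAGECACHE_TAG_DIRTY) &&
            !mapping_tagged(mapping, PAGECACHE_TAG_WRITEBACK))
                return false;
        if (end_byte < start_byte)
                return false;

        /* Walk the range and look for anything we might block on */
        rcu_read_lock();
        xas_for_each(&xas, page, max) {
                if (xas_retry(&xas, page))
                        continue;
                if (xa_is_value(page))
                        continue;
                if (PageDirty(page) || PageLocked(page) ||
                    PageWriteback(page))
                        break;
        }
        rcu_read_unlock();
        return page != NULL;
}

The intent is that the non-blocking read path can attempt the buffered
read inline when this returns false, and only punt to a worker when it
returns true.

--
Jens Axboe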