On 2/10/21 1:07 AM, Sedat Dilek wrote:
> On Tue, Feb 9, 2021 at 10:25 PM Jens Axboe <axboe@xxxxxxxxx> wrote:
>>
>> On 2/9/21 12:55 PM, Andrew Morton wrote:
>>> On Mon, 8 Feb 2021 19:30:05 -0700 Jens Axboe <axboe@xxxxxxxxx> wrote:
>>>
>>>> Hi,
>>>>
>>>> For v1, see:
>>>>
>>>> https://lore.kernel.org/linux-fsdevel/20210208221829.17247-1-axboe@xxxxxxxxx/
>>>>
>>>> tldr; don't -EAGAIN IOCB_NOWAIT dio reads just because we have page cache
>>>> entries for the given range. This causes unnecessary work from the callers
>>>> side, when the IO could have been issued totally fine without blocking on
>>>> writeback when there is none.
>>>>
>>>
>>> Seems a good idea. Obviously we'll do more work in the case where some
>>> writeback needs doing, but we'll be doing synchronous writeout in that
>>> case anyway so who cares.
>>
>> Right, I think that'll be a round two on top of this, so we can make the
>> write side happier too. That's a bit more involved...
>>
>>> Please remind me what prevents pages from becoming dirty during or
>>> immediately after the filemap_range_needs_writeback() check? Perhaps
>>> filemap_range_needs_writeback() could have a comment explaining what it
>>> is that keeps its return value true after it has returned it!
>>
>> It's inherently racy, just like it is now. There's really no difference
>> there, and I don't think there's a way to close that. Even if you
>> modified filemap_write_and_wait_range() to be non-block friendly,
>> there's nothing stopping anyone from adding dirty page cache right after
>> that call.
>>
>
> Jens, do you have some numbers before and after your patchset is applied?

I don't, the load was pretty light for the test case - it was just doing
33-34K of O_DIRECT 4k random reads in a pretty small range of the device.
When you end up having page cache in that range, that means you end up
punting a LOT of requests to the async worker. So it wasn't as much a
performance win for this particular case, but an efficiency win. You get
rid of a worker using 40% CPU, and reduce the latencies.

> And kindly a test "profile" for FIO :-)?

To reproduce this, have a small range dio rand reads and then have
something else that does a few buffered reads from the same range.

-- 
Jens Axboe
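
For anyone who wants to poke at this without fio, here is a rough sketch of the reproducer described above. It is not the test that was actually run: the original case went through the async worker, while this sketch assumes that preadv2() with RWF_NOWAIT on an O_DIRECT fd reaches the same IOCB_NOWAIT dio read path. The function names, the seed parameter and the default block size are made up for illustration. EAGAIN can also come back for reasons other than page cache, and some filesystems reject RWF_NOWAIT or O_DIRECT outright, so treat the counts as a hint only.

```python
import mmap
import os
import random

BLOCK_SIZE = 4096  # 4k reads, matching the test case in the message


def aligned_offsets(range_bytes, count, block_size=BLOCK_SIZE, seed=0):
    """Return `count` random block-aligned offsets inside [0, range_bytes)."""
    blocks = range_bytes // block_size
    if blocks < 1:
        raise ValueError("range must hold at least one block")
    rng = random.Random(seed)
    return [rng.randrange(blocks) * block_size for _ in range(count)]


def warm_page_cache(path, range_bytes, block_size=BLOCK_SIZE):
    """Do buffered reads over the range so page cache entries exist for it."""
    with open(path, "rb") as f:
        remaining = range_bytes
        while remaining > 0:
            chunk = f.read(min(block_size, remaining))
            if not chunk:
                break  # file is shorter than the requested range
            remaining -= len(chunk)


def nowait_dio_reads(path, offsets, block_size=BLOCK_SIZE):
    """Issue O_DIRECT reads with RWF_NOWAIT; return (completed, eagain)."""
    fd = os.open(path, os.O_RDONLY | os.O_DIRECT)
    # O_DIRECT needs an aligned buffer; an anonymous mmap is page-aligned.
    buf = mmap.mmap(-1, block_size)
    completed = eagain = 0
    try:
        for off in offsets:
            try:
                os.preadv(fd, [buf], off, os.RWF_NOWAIT)
                completed += 1
            except BlockingIOError:
                # Kernel returned -EAGAIN: the read would have to be retried
                # from a context that is allowed to block.
                eagain += 1
    finally:
        buf.close()
        os.close(fd)
    return completed, eagain
```

Usage would be something like calling nowait_dio_reads() on a test file or device, then warm_page_cache() on the same range, then nowait_dio_reads() again, and comparing the EAGAIN counts between the two runs on kernels with and without the patch.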