Sorry, I'm only now getting back to this.

On Fri, Dec 04, 2020 at 12:48:49PM +0000, Christoph Hellwig wrote:
> On Thu, Dec 03, 2020 at 05:36:07PM -0500, Johannes Weiner wrote:
> > Correct, it's only interesting for pages under LRU management - page
> > cache and swap pages. It should not matter for direct IO.
> >
> > The VM uses the page flag to tell the difference between cold faults
> > (empty cache startup e.g.), and thrashing pages which are being read
> > back not long after they have been reclaimed. This influences reclaim
> > behavior, but can also indicate a general lack of memory.
>
> I really wonder if we should move setting the flag out of bio_add_page
> and into the writeback code, as it will do the wrong things for
> non-writeback I/O, that is direct I/O or its in-kernel equivalents.

Good point. When somebody does direct IO reads into a user page that
happens to have the flag set, we misattribute submission delays.

There is some background discussion from when I first submitted the
patch, which did the annotations on the writeback/page cache side:

https://lore.kernel.org/lkml/20190722201337.19180-1-hannes@xxxxxxxxxxx/

Fragility is a concern, as this is part of the writeback code that is
spread out over several fs-specific implementations, and it's somewhat
easy to get the annotation wrong.

Some possible options I can think of:

1 open-coding the submit_bio() annotations in writeback code, like the
  original patch
  pros: no bio layer involvement at all - no BIO_WORKINGSET flag
  cons: lots of copy-paste code & comments

2 open-coding if (PageWorkingset()) bio_set_flag(BIO_WORKINGSET) in
  writeback code
  pros: slightly less complex callsite code, eliminates read check in
        submit_bio()
  cons: still somewhat copy-pasty (but the surrounding code is as well)

3 adding a bio_add_page_memstall() as per Dave in the original patch
  thread
  pros: minimal churn and self-documenting (may need a better name)
  cons: easy to incorrectly use raw bio_add_page() in writeback code

4 writeback & direct-io versions for bio_add_page()
  pros: hard to misuse
  cons: awkward interface/layering

5 flag bio itself as writeback or direct-io (BIO_BUFFERED?)
  pros: single version of bio_add_page()
  cons: easy to miss setting the flag, similar to 3

Personally, I'm torn between 2 and 5. What do you think?
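
To make 2 a bit more concrete, here is a rough userspace model of the
idea. It is only a sketch under stated assumptions: struct model_page,
struct model_bio and the model_*() helpers are made-up stand-ins, not
the kernel's struct page, struct bio or bio_add_page(), and the real
callsites would be spread over the fs-specific code.

```c
/* Userspace model only: stand-ins, not the kernel definitions. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_BIO_MAX_PAGES 16

struct model_page {
	bool workingset;	/* stands in for PageWorkingset(page) */
};

struct model_bio {
	bool workingset;	/* stands in for the BIO_WORKINGSET bio flag */
	size_t nr_pages;
	struct model_page *pages[MODEL_BIO_MAX_PAGES];
};

/* Generic add helper: under option 2 it never looks at the page flag. */
static void model_bio_add_page(struct model_bio *bio, struct model_page *page)
{
	assert(bio->nr_pages < MODEL_BIO_MAX_PAGES);
	bio->pages[bio->nr_pages++] = page;
}

/* Page cache caller open-codes the annotation at its own callsite. */
static void model_pagecache_add(struct model_bio *bio, struct model_page *page)
{
	model_bio_add_page(bio, page);
	if (page->workingset)
		bio->workingset = true;
}

/* Direct IO caller uses the raw helper, so the bio is never flagged. */
static void model_dio_add(struct model_bio *bio, struct model_page *page)
{
	model_bio_add_page(bio, page);
}
```

In this model a direct IO read into a page that happens to have the
flag set leaves the bio unflagged, so nothing gets misattributed. The
cost is that every page cache callsite has to remember the two extra
lines.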