On 10/12/2020 13:18, Johannes Weiner wrote:
> Sorry, I'm only now getting back to this.
>
> On Fri, Dec 04, 2020 at 12:48:49PM +0000, Christoph Hellwig wrote:
>> On Thu, Dec 03, 2020 at 05:36:07PM -0500, Johannes Weiner wrote:
>>> Correct, it's only interesting for pages under LRU management - page
>>> cache and swap pages. It should not matter for direct IO.
>>>
>>> The VM uses the page flag to tell the difference between cold faults
>>> (empty cache startup e.g.), and thrashing pages which are being read
>>> back not long after they have been reclaimed. This influences reclaim
>>> behavior, but can also indicate a general lack of memory.
>>
>> I really wonder if we should move setting the flag out of bio_add_page
>> and into the writeback code, as it will do the wrong things for
>> non-writeback I/O, that is direct I/O or its in-kernel equivalents.
>
> Good point. When somebody does direct IO reads into a user page that
> happens to have the flag set, we misattribute submission delays.
>
> There is some background discussion from when I first submitted the
> patch, which did the annotations on the writeback/page cache side:
>
> https://lore.kernel.org/lkml/20190722201337.19180-1-hannes@xxxxxxxxxxx/
>
> Fragility is a concern, as this is part of the writeback code that is
> spread out over several fs-specific implementations, and it's somewhat
> easy to get the annotation wrong.
>
> Some possible options I can think of:
>
> 1 open-coding the submit_bio() annotations in writeback code, like the original patch
>   pros: no bio layer involvement at all - no BIO_WORKINGSET flag
>   cons: lots of copy-paste code & comments
>
> 2 open-coding if (PageWorkingset()) bio_set_flag(BIO_WORKINGSET) in writeback code
>   pros: slightly less complex callsite code, eliminates read check in submit_bio()
>   cons: still somewhat copy-pasty (but the surrounding code is as well)
>
> 3 adding a bio_add_page_memstall() as per Dave in the original patch thread
>   pros: minimal churn and self-documenting (may need a better name)
>   cons: easy to incorrectly use raw bio_add_page() in writeback code
>
> 4 writeback & direct-io versions for bio_add_page()
>   pros: hard to misuse
>   cons: awkward interface/layering
>
> 5 flag bio itself as writeback or direct-io (BIO_BUFFERED?)
>   pros: single version of bio_add_page()
>   cons: easy to miss setting the flag, similar to 3
>
> Personally, I'm torn between 2 and 5. What do you think?

I was thinking that an inverted 3 would be easier, i.e. leaving the
annotating bio_add_page() as it is and using a special version of it
for direct IO. IIRC we only need to change bio_iov_iter_get_pages() +
its helpers for that.

-- 
Pavel Begunkov
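
To make the "inverted 3" idea concrete, here is a rough userspace model
of it, not kernel code. The structs are simplified stand-ins, and every
model_* name (including the direct IO variant) is hypothetical and made
up for illustration. The only point is the split: the default add-page
path keeps the workingset annotation, and direct IO callers get a
variant that skips it.

```c
/*
 * Userspace model only: simplified stand-ins, not the kernel's
 * struct page / struct bio. All model_* names are hypothetical.
 */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_BIO_MAX_PAGES 16	/* arbitrary capacity for the model */

struct page {
	bool workingset;	/* stands in for PageWorkingset(page) */
};

struct bio {
	bool workingset;	/* stands in for the BIO_WORKINGSET flag */
	size_t nr_pages;
	struct page *pages[MODEL_BIO_MAX_PAGES];
};

/*
 * Shared helper: attach the page without touching the accounting flag.
 * Returns false when the bio has no room left.
 */
static bool model_add_page_raw(struct bio *bio, struct page *page)
{
	assert(bio != NULL && page != NULL);
	if (bio->nr_pages >= MODEL_BIO_MAX_PAGES)
		return false;
	bio->pages[bio->nr_pages++] = page;
	return true;
}

/* Default path (page cache, swap): attach and keep the annotation. */
static bool model_bio_add_page(struct bio *bio, struct page *page)
{
	if (!model_add_page_raw(bio, page))
		return false;
	if (page->workingset)
		bio->workingset = true;
	return true;
}

/* Hypothetical direct IO variant: same attach, annotation skipped. */
static bool model_bio_add_page_direct(struct bio *bio, struct page *page)
{
	return model_add_page_raw(bio, page);
}
```

With this split, a page that happens to have the workingset flag set
only marks the bio when it goes through model_bio_add_page(). A bio
built through model_bio_add_page_direct() stays unmarked, so submission
delays for it would not be attributed to thrashing.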