On Tue, Dec 01, 2020 at 01:32:26PM +0000, Christoph Hellwig wrote:
> On Tue, Dec 01, 2020 at 01:17:49PM +0000, Pavel Begunkov wrote:
> > I was thinking about memcpy bvec instead of iterating as a first step,
> > and then try to reuse passed in bvec.
> >
> > A thing that doesn't play nice with that is setting BIO_WORKINGSET in
> > __bio_add_page(), which requires to iterate all pages anyway. I have no
> > clue what it is, so rather to ask if we can optimise it out somehow?
> > Apart from pre-computing for specific cases...
> >
> > E.g. can pages of a single bvec segment be both in and out of a working
> > set? (i.e. PageWorkingset(page)).
>
> Adding Johannes for the PageWorkingset logic, which keeps confusing me
> everytime I look at it. I think it is intended to deal with pages
> being swapped out and in, and doesn't make much sense to look at in
> any form for direct I/O, but as said I'm rather confused by this code.

Correct, it's only interesting for pages under LRU management - page
cache and swap pages. It should not matter for direct IO.

The VM uses the page flag to tell the difference between cold faults
(empty cache startup e.g.), and thrashing pages which are being read
back not long after they have been reclaimed. This influences reclaim
behavior, but can also indicate a general lack of memory.

The BIO_WORKINGSET flag is for the latter. To calculate the time
wasted by a lack of memory (memory pressure), we measure the total
time processes wait for thrashing pages. Usually that time is
dominated by waiting for in-flight io to complete and pages to become
uptodate. These waits are annotated on the page cache side.

However, in some cases, the IO submission path itself can block for
extended periods - if the device is congested or submissions are
throttled due to cgroup policy. To capture those waits, the bio is
flagged when it's for thrashing pages, and then submit_bio() will
report submission time of that bio as a thrashing-related delay.
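
The flagging scheme described above can be sketched roughly as follows.
This is a userspace model for illustration only, not the kernel code:
struct sketch_page, struct sketch_bio, sketch_bio_add_page(),
sketch_submit_bio() and thrash_delay_ns are hypothetical names, and the
submission time is passed in as a plain number instead of being measured.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins; these are NOT the kernel's struct page / struct bio. */
struct sketch_page {
	bool workingset;	/* models PageWorkingset(page) */
};

struct sketch_bio {
	bool workingset;	/* models the BIO_WORKINGSET flag */
	size_t nr_pages;	/* number of pages added so far */
};

/* Total submission time attributed to thrashing, in nanoseconds (assumed unit). */
static uint64_t thrash_delay_ns;

/*
 * Add a page to the bio. The first thrashing page marks the whole bio;
 * later pages cannot clear the flag again.
 */
static void sketch_bio_add_page(struct sketch_bio *bio,
				const struct sketch_page *page)
{
	assert(bio != NULL && page != NULL);

	bio->nr_pages++;
	if (!bio->workingset && page->workingset)
		bio->workingset = true;
}

/*
 * Submit the bio. Only flagged bios have their submission time counted
 * as a thrashing-related delay; unflagged bios are not accounted here.
 */
static void sketch_submit_bio(const struct sketch_bio *bio, uint64_t submit_ns)
{
	assert(bio != NULL);

	if (bio->workingset)
		thrash_delay_ns += submit_ns;
}
```

In this model, a bio built from one cold page and one thrashing page ends
up flagged, so its whole submission time is counted, while a bio made only
of cold pages contributes nothing to thrash_delay_ns.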

[ Obviously, in theory bios could have a mix of thrashing and
  non-thrashing pages, and the submission stall could have occurred
  even without the thrashing pages. But in practice we have locality,
  where groups of pages tend to be accessed/reclaimed/refaulted
  together. The assumption that the whole bio is due to thrashing when
  we see the first thrashing page is a workable simplification. ]

HTH