Re: [PATCH v3 2/9] mm: optimize do_wp_page() for fresh pages in local LRU pagevecs

Vlastimil Babka <vbabka@xxxxxxx> · Wed, 9 Mar 2022 18:53:23 +0100



On 1/31/22 17:29, David Hildenbrand wrote:
> For example, if a page just got swapped in via a read fault, the LRU
> pagevecs might still hold a reference to the page. If we trigger a
> write fault on such a page, the additional reference from the LRU
> pagevecs will prohibit reusing the page.
> 
> Let's conditionally drain the local LRU pagevecs when we stumble over a
> !PageLRU() page. We cannot easily drain remote LRU pagevecs and it might
> not be desirable performance-wise. Consequently, this will only avoid
> copying in some cases.
> 
> Add a simple "page_count(page) > 3" check first but keep the
> "page_count(page) > 1 + PageSwapCache(page)" check in place, as
> we want to minimize cases where we remove a page from the swapcache but
> won't be able to reuse it, for example, because another process has it
> mapped R/O, to not affect reclaim.
> 
> We cannot easily handle the following cases and we will always have to
> copy:
> 
> (1) The page is referenced in the LRU pagevecs of other CPUs. We really
>     would have to drain the LRU pagevecs of all CPUs -- most probably
>     copying is much cheaper.
> 
> (2) The page is already PageLRU() but is getting moved between LRU
>     lists, for example, for activation (e.g., mark_page_accessed()),
>     deactivation (MADV_COLD), or lazyfree (MADV_FREE). We'd have to
>     drain mostly unconditionally, which might be bad performance-wise.
>     Most probably this won't happen too often in practice.
> 
> Note that there are other reasons why an anon page might temporarily not
> be PageLRU(): for example, compaction and migration have to isolate LRU
> pages from the LRU lists first (isolate_lru_page()), moving them to
> temporary local lists and clearing PageLRU() and holding an additional
> reference on the page. In that case, we'll always copy.
> 
> This change seems to be fairly effective with the reproducer [1] shared
> by Nadav, as long as writeback is done synchronously, for example, using
> zram. However, with asynchronous writeback, we'll usually fail to free the
> swapcache because the page is still under writeback: something we cannot
> easily optimize for, and maybe it's not really relevant in practice.
> 
> [1] https://lkml.kernel.org/r/0480D692-D9B2-429A-9A88-9BBA1331AC3A@xxxxxxxxx
> 
> Signed-off-by: David Hildenbrand <david@xxxxxxxxxx>

Acked-by: Vlastimil Babka <vbabka@xxxxxxx>