Re: Issue with 8K folio size in __filemap_get_folio()

David Hildenbrand <david@xxxxxxxxxx> · Mon, 4 Dec 2023 16:09:36 +0100

On 04.12.23 00:12, Matthew Wilcox wrote:
> On Sun, Dec 03, 2023 at 09:27:57PM +0000, Matthew Wilcox wrote:
>> I was talking with Darrick on Friday and he convinced me that this is
>> something we're going to need to fix sooner rather than later for the
>> benefit of devices with block size 8kB.  So it's definitely on my todo
>> list, but I haven't investigated in any detail yet.
>
> OK, here's my initial analysis of just not putting order-1 folios
> on the deferred split list.  folio->_deferred_list is only used in
> mm/huge_memory.c, which makes this a nice simple analysis.
>
>   - folio_prep_large_rmappable() initialises the list_head.  No problem,
>     just don't do that for order-1 folios.
>   - split_huge_page_to_list() will remove the folio from the split queue.
>     No problem, just don't do that.
>   - folio_undo_large_rmappable() removes it from the list if it's
>     on the list.  Again, no problem, don't do that for order-1 folios.
>   - deferred_split_scan() walks the list, it won't find any order-1
>     folios.
>
>   - deferred_split_folio() will add the folio to the list.  Returning
>     here will avoid adding the folio to the list.  But what consequences
>     will that have?  Ah.  There's only one caller of
>     deferred_split_folio() and it's in page_remove_rmap() ... and it's
>     only called for anon folios anyway.
>
> So it looks like we can support order-1 folios in the page cache without
> any change in behaviour since file-backed folios were never added to
> the deferred split list.
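
For readers following along, the analysis above can be sketched as a small
userspace model. This is not kernel code and not a proposed patch: every
toy_* name, the struct layout, and the "order >= 2" check are assumptions
made up for illustration. It only shows the shape of the argument, namely
that each place touching the deferred list can skip order-1 folios, and
that file-backed folios never reach the queue in the first place.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical minimal circular list, loosely modelled on list_head. */
struct toy_list { struct toy_list *prev, *next; };

static void toy_list_init(struct toy_list *l) { l->prev = l->next = l; }
static bool toy_list_empty(const struct toy_list *l) { return l->next == l; }

static void toy_list_add_tail(struct toy_list *n, struct toy_list *head)
{
    n->prev = head->prev;
    n->next = head;
    head->prev->next = n;
    head->prev = n;
}

static void toy_list_del_init(struct toy_list *n)
{
    n->prev->next = n->next;
    n->next->prev = n->prev;
    toy_list_init(n);
}

/* Hypothetical stand-in for a folio; NOT the real struct folio layout. */
struct toy_folio {
    unsigned int order;
    bool anon;
    struct toy_list deferred;   /* assumed usable only when order >= 2 */
};

/* Assumption for this sketch: order-1 folios never use the list. */
static bool toy_has_deferred_list(const struct toy_folio *f)
{
    return f->order >= 2;
}

/* Models the first bullet: skip list initialisation for order-1. */
static void toy_prep_large_rmappable(struct toy_folio *f)
{
    if (toy_has_deferred_list(f))
        toy_list_init(&f->deferred);
}

/* Models the last bullet: return early instead of queueing order-1. */
static bool toy_deferred_split_folio(struct toy_folio *f, struct toy_list *queue)
{
    if (!toy_has_deferred_list(f))
        return false;
    if (!toy_list_empty(&f->deferred))
        return false;           /* already queued */
    toy_list_add_tail(&f->deferred, queue);
    return true;
}

/* Models the single caller: only anon folios are ever queued. */
static void toy_remove_rmap(struct toy_folio *f, struct toy_list *queue)
{
    if (f->anon)
        toy_deferred_split_folio(f, queue);
}

/* Models the third bullet: only unlink folios that can be on the list. */
static void toy_undo_large_rmappable(struct toy_folio *f)
{
    if (toy_has_deferred_list(f) && !toy_list_empty(&f->deferred))
        toy_list_del_init(&f->deferred);
}

/* Counts queued entries, standing in for the scan walking the list. */
static size_t toy_queue_len(const struct toy_list *head)
{
    size_t n = 0;
    for (const struct toy_list *p = head->next; p != head; p = p->next)
        n++;
    return n;
}

/* Returns how many folios were queued: one file order-1, one anon order-2. */
size_t toy_demo(void)
{
    struct toy_list queue;
    struct toy_folio file_o1 = { .order = 1, .anon = false };
    struct toy_folio anon_o2 = { .order = 2, .anon = true };

    toy_list_init(&queue);
    toy_prep_large_rmappable(&file_o1);
    toy_prep_large_rmappable(&anon_o2);
    toy_remove_rmap(&file_o1, &queue);
    toy_remove_rmap(&anon_o2, &queue);

    size_t queued = toy_queue_len(&queue);

    toy_undo_large_rmappable(&file_o1);
    toy_undo_large_rmappable(&anon_o2);
    assert(toy_list_empty(&queue));
    return queued;
}
```

Calling toy_demo() from a main() queues only the anon order-2 folio; the
file-backed order-1 folio never touches its (uninitialised) list entry,
which is the "no change in behaviour" point made above.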

I think for the pagecache it should work. In the context of [1], a total 
mapcount would likely still be possible. Anything beyond that likely 
not, if we ever care.

[1] https://lkml.kernel.org/r/20231124132626.235350-1-david@xxxxxxxxxx

--
Cheers,

David / dhildenb