Re: [PATCH 18/37] ext4: Convert ext4 to read_folio

"Theodore Ts'o" <tytso@xxxxxxx> · Mon, 9 May 2022 16:16:19 -0400

On Mon, May 09, 2022 at 03:07:20PM +0100, Matthew Wilcox wrote:
> 
> I'm probably answering these emails out of order,

No worries, I've been reviewing these patches out of order myself.
:-)

> but the page
> cache is absolutely not supposed to be creating large folios for
> filesystems that haven't indicated their support for such by calling
> mapping_set_large_folios().

I think my concern is that at some point in the future, ext4 probably
*will* want to enable large folios --- and we may want to do so
selectively, e.g., just on the read path, on the assumption that
someone will break apart large folios into individual pages on the
write path.

The question is when do we add all of these sanity check asserts ---
at the point when ext4 starts making the transition from large folio
unaware, to large folio kind-of-aware, and hope we don't miss any of
these interfaces?  Or add those sanity check asserts now, so we get
reminded that some of these functions may need fixing up when we start
adding large folio support to the file system?

Also, what's the intent for when the MM layer would call
aops->read_folio() with the intent to fill a huge folio, versus
calling aops->readahead()?  After all, when we take a page fault,
it'll be for a single 4k page, right?  We currently don't support
file-backed huge pages; is there a plan to change this?

		    	  	    	    - Ted

P.S.  On a somewhat unrelated issue, if we have a really large folio
caused by a 4MB readahead because CIFS really wanted a huge readahead
size because of the network setup overhead --- and then a single 4k
page gets dirtied, I imagine the VM subsystem would *want* to break apart
that 4MB folio so that we know that only that single 4k page was
dirtied, and not require writing back a huge amount of clean 4k pages
just because we didn't track dirtiness at the right granularity,
right?
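
To put rough numbers on that, here is a minimal userspace sketch (not
kernel code; the helper names are made up for illustration, and it
assumes a 4k base page and that whole-folio dirty tracking writes back
the entire folio when any page in it is dirty):

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: number of base pages covered by one folio. */
static size_t pages_in_folio(size_t folio_bytes, size_t page_bytes)
{
	assert(page_bytes > 0 && folio_bytes % page_bytes == 0);
	return folio_bytes / page_bytes;
}

/*
 * Bytes written back if dirtiness is tracked per folio (assumption):
 * one dirty page is enough to make the whole folio eligible.
 */
static size_t writeback_bytes_per_folio(size_t folio_bytes, size_t page_bytes,
					size_t dirty_pages)
{
	assert(dirty_pages <= pages_in_folio(folio_bytes, page_bytes));
	return dirty_pages ? folio_bytes : 0;
}

/* Bytes written back if dirtiness is tracked per base page. */
static size_t writeback_bytes_per_page(size_t page_bytes, size_t dirty_pages)
{
	return dirty_pages * page_bytes;
}
```

With a 4MB folio and one dirty 4k page, the per-folio model writes back
all 1024 pages (4194304 bytes) where the per-page model writes back
4096 bytes, i.e. 1023 clean pages go out for no reason.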