Re: [PATCH 2/3] xfs: use folios in the buffer cache

On Thu, Jan 18, 2024 at 05:26:24PM -0800, Darrick J. Wong wrote:
> On Fri, Jan 19, 2024 at 09:19:40AM +1100, Dave Chinner wrote:
> > From: Dave Chinner <dchinner@xxxxxxxxxx>
> > 
> > Convert the use of struct pages to struct folio everywhere. This
> > is just direct API conversion; no actual logic or code changes
> > should result.
> > 
> > Note: this conversion currently assumes only single page folios are
> > allocated, and because some of the MM interfaces we use take
> > pointers to arrays of struct pages, the addresses of single page
> > folios and struct pages are the same, e.g. alloc_pages_bulk_array(),
> > vm_map_ram(), etc.
> > 
> > 
> > Signed-off-by: Dave Chinner <dchinner@xxxxxxxxxx>
.....
> > @@ -387,9 +387,9 @@ xfs_buf_alloc_pages(
> >  	for (;;) {
> >  		long	last = filled;
> >  
> > -		filled = alloc_pages_bulk_array(gfp_mask, bp->b_page_count,
> > -						bp->b_pages);
> > -		if (filled == bp->b_page_count) {
> > +		filled = alloc_pages_bulk_array(gfp_mask, bp->b_folio_count,
> > +						(struct page **)bp->b_folios);
> 
> Ugh, pointer casting.  I suppose here is where we might want an
> alloc_folio_bulk_array that might give us successively smaller
> large-folios until b_page_count is satisfied?  (Maybe that's in the next
> patch?)

No, I explicitly chose not to do that because then converting a buffer
offset to a memory address becomes excitingly complex. With fixed size
folios, it's just simple math. With variable, unknown-sized objects,
we either have to store the size of each object alongside the pointer,
or walk each object grabbing its size to determine which folio in the
buffer corresponds to a specific offset.

And it's now the slow path, so I don't really care to optimise it
that much.
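
To illustrate the "simple math" bit: with single page folios the
offset-to-address lookup is just a shift and a mask, roughly like the
sketch below. This is only an illustration - the helper name is made
up and it is not the exact code in this patch:

/*
 * Sketch: map a byte offset within the buffer to a kernel address,
 * assuming every entry in bp->b_folios is a single page folio.
 */
static void *
xfs_buf_offset_sketch(
	struct xfs_buf	*bp,
	size_t		offset)
{
	struct folio	*folio;

	/* vmapped buffers are virtually contiguous */
	if (bp->b_addr)
		return bp->b_addr + offset;

	/* fixed size objects, so index and remainder are simple math */
	folio = bp->b_folios[offset >> PAGE_SHIFT];
	return folio_address(folio) + (offset & (PAGE_SIZE - 1));
}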

> I guess you'd also need a large-folio capable vm_map_ram.  Both of
> these things sound reasonable, particularly if somebody wants to write
> us a new buffer cache for ext2rs and support large block sizes.

Maybe so, but we do not require them and I don't really have the
time or desire to try to implement something like that. And, really,
what benefit do small multipage folios bring us if we still have to
vmap them?

> Assuming that one of the goals here is (say) to be able to mount a 16k
> blocksize filesystem and try to get 16k folios for the buffer cache?

The goal is that we optimistically use large folios wherever we
have metadata buffers that are larger than a single page, regardless
of the filesystem block size.

Right now, on a 4kB block size filesystem, that means inode cluster
buffers (16kB for 512 byte inodes), user xattr buffers larger than a
single page, and directory blocks if the filesystem is configured
with "-n size=X" and X is 8kB or larger.

On filesystems with block sizes larger than 4kB, it will try to use
large folios for everything but the sector-sized AG headers.
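
As a rough sketch of that "optimistic" policy (names and details here
are illustrative, not the code in this series): try one high-order
folio covering the whole buffer first, and only fall back to filling
out the folio array with single page folios if that allocation fails.

/*
 * Sketch only: attempt a single high-order folio for the whole
 * buffer. On failure the caller would fall back to populating
 * bp->b_folios with single page folios as it does today.
 */
static int
xfs_buf_alloc_large_folio_sketch(
	struct xfs_buf	*bp,
	gfp_t		gfp_mask)
{
	unsigned int	order = get_order(BBTOB(bp->b_length));
	struct folio	*folio;

	folio = folio_alloc(gfp_mask, order);
	if (!folio)
		return -ENOMEM;	/* fall back to single page folios */

	bp->b_folios[0] = folio;
	bp->b_folio_count = 1;
	bp->b_addr = folio_address(folio);
	return 0;
}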

-Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx



