On Tue, Mar 19, 2024 at 10:48:19AM -0700, Darrick J. Wong wrote:
> On Tue, Mar 19, 2024 at 09:45:59AM +1100, Dave Chinner wrote:
> > From: Christoph Hellwig <hch@xxxxxx>
> >
> > Instead of allocating the folios manually using the bulk page
> > allocator and then using vm_map_page just use vmalloc to allocate
> > the entire buffer - vmalloc will use the bulk allocator internally
> > if it fits.
> >
> > With this the b_folios array can go away as well as nothing uses it.
> >
> > [dchinner: port to folio based buffers.]
> >
> > Signed-off-by: Christoph Hellwig <hch@xxxxxx>
> > Signed-off-by: Dave Chinner <dchinner@xxxxxxxxxx>
> > ---
> >  fs/xfs/xfs_buf.c     | 164 ++++++++++++-------------------------------
> >  fs/xfs/xfs_buf.h     |   2 -
> >  fs/xfs/xfs_buf_mem.c |   9 +--
> >  3 files changed, 45 insertions(+), 130 deletions(-)
> >
> > diff --git a/fs/xfs/xfs_buf.c b/fs/xfs/xfs_buf.c
> > index 303945554415..6d6bad80722e 100644
> > --- a/fs/xfs/xfs_buf.c
> > +++ b/fs/xfs/xfs_buf.c
> > @@ -282,29 +282,6 @@ _xfs_buf_alloc(
> >  	return 0;
> >  }
> >
> > -static void
> > -xfs_buf_free_folios(
> > -	struct xfs_buf	*bp)
> > -{
> > -	uint		i;
> > -
> > -	ASSERT(bp->b_flags & _XBF_FOLIOS);
> > -
> > -	if (xfs_buf_is_vmapped(bp))
> > -		vm_unmap_ram(bp->b_addr, bp->b_folio_count);
> > -
> > -	for (i = 0; i < bp->b_folio_count; i++) {
> > -		if (bp->b_folios[i])
> > -			__folio_put(bp->b_folios[i]);
> > -	}
> > -	mm_account_reclaimed_pages(bp->b_folio_count);
> > -
> > -	if (bp->b_folios != bp->b_folio_array)
> > -		kfree(bp->b_folios);
> > -	bp->b_folios = NULL;
> > -	bp->b_flags &= ~_XBF_FOLIOS;
> > -}
> > -
> >  static void
> >  xfs_buf_free_callback(
> >  	struct callback_head	*cb)
> > @@ -323,13 +300,22 @@ xfs_buf_free(
> >
> >  	ASSERT(list_empty(&bp->b_lru));
> >
> > -	if (xfs_buftarg_is_mem(bp->b_target))
> > +	if (xfs_buftarg_is_mem(bp->b_target)) {
> >  		xmbuf_unmap_folio(bp);
> > -	else if (bp->b_flags & _XBF_FOLIOS)
> > -		xfs_buf_free_folios(bp);
> > -	else if (bp->b_flags & _XBF_KMEM)
> > -		kfree(bp->b_addr);
> > +		goto free;
> > +	}
> >
> > +	if (!(bp->b_flags & _XBF_KMEM))
> > +		mm_account_reclaimed_pages(bp->b_folio_count);
>
> Echoing hch's statement about the argument being passed to
> mm_account_reclaimed_pages needing to be fed units of base pages, not
> folios.

> > +
> > +	if (bp->b_flags & _XBF_FOLIOS)
> > +		__folio_put(kmem_to_folio(bp->b_addr));
>
> Is it necessary to use folio_put instead of the __ version like hch said
> earlier?

Both fixed.

> > +	else
> > +		kvfree(bp->b_addr);
> > +
> > +	bp->b_flags &= _XBF_KMEM | _XBF_FOLIOS;
>
> Shouldn't this be:
>
> 	bp->b_flags &= ~(_XBF_KMEM | _XBF_FOLIOS); ?

Yes. Good catch.

> > @@ -377,14 +361,15 @@ xfs_buf_alloc_folio(
> >  	struct xfs_buf	*bp,
> >  	gfp_t		gfp_mask)
> >  {
> > +	struct folio	*folio;
> >  	int		length = BBTOB(bp->b_length);
> >  	int		order = get_order(length);
> >
> > -	bp->b_folio_array[0] = folio_alloc(gfp_mask, order);
> > -	if (!bp->b_folio_array[0])
> > +	folio = folio_alloc(gfp_mask, order);
> > +	if (!folio)
> >  		return false;
> >
> > -	bp->b_folios = bp->b_folio_array;
> > +	bp->b_addr = folio_address(folio);
> >  	bp->b_folio_count = 1;
> >  	bp->b_flags |= _XBF_FOLIOS;
> >  	return true;
> > @@ -400,15 +385,11 @@ xfs_buf_alloc_folio(
> >   * contiguous memory region that we don't have to map and unmap to access the
> >   * data directly.
> >   *
> > - * The second type of buffer is the multi-folio buffer. These are *always* made
> > - * up of single page folios so that they can be fed to vmap_ram() to return a
> > - * contiguous memory region we can access the data through.
> > - *
> > - * We don't use high order folios for this second type of buffer (yet) because
> > - * having variable size folios makes offset-to-folio indexing and iteration of
> > - * the data range more complex than if they are fixed size. This case should now
> > - * be the slow path, though, so unless we regularly fail to allocate high order
> > - * folios, there should be little need to optimise this path.
> > + * The second type of buffer is the vmalloc()d buffer. This provides the buffer
> > + * with the required contiguous memory region but backed by discontiguous
> > + * physical pages. vmalloc() typically doesn't fail, but it can and so we may
> > + * need to wrap the allocation in a loop to prevent low memory failures and
> > + * shutdowns.
>
> Where's the loop now? Is that buried under __vmalloc somewhere?

I thought I'd added __GFP_NOFAIL to the __vmalloc() gfp mask to make it
loop. I suspect I lost it at some point when rebasing either this or
the (now merged) kmem.[ch] removal patchset.

Well spotted, I'll fix that up.

Cheers,

Dave.
--
Dave Chinner
david@xxxxxxxxxxxxx