Re: [PATCH v2 2/4] nfs/blocklayout: Use bulk page allocation APIs

Chuck Lever <chuck.lever@xxxxxxxxxx> · Sat, 22 Jun 2024 12:29:31 -0400

On Sat, Jun 22, 2024 at 07:08:12AM +0200, Christoph Hellwig wrote:
> On Fri, Jun 21, 2024 at 12:22:30PM -0400, cel@xxxxxxxxxx wrote:
> > From: Chuck Lever <chuck.lever@xxxxxxxxxx>
> > 
> > nfs4_get_device_info() frequently requests more than a few pages
> > when provisioning a nfs4_deviceid_node object. Make this more
> > efficient by using alloc_pages_bulk_array(). This API is known to be
> > several times faster than an open-coded loop around alloc_page().
> > 
> > release_pages() is folio-enabled so it is also more efficient than
> > repeatedly invoking __free_pages().
> 
> This isn't really a pnfs fix, right?  Just a little optimization.

It doesn't say "fix" anywhere and doesn't include a Fixes: tag.
And subsequent patches in the series are also clearly not fixes.

I can make it more clear that this one is only an optimization.

> It does looks fine to me:
> 
> Reviewed-by: Christoph Hellwig <hch@xxxxxx>

Thank you!

> But I'd really wish if we could do better than this with lazy
> decoding in ->alloc_deviceid_node, which (at least for blocklayout)
> knows roughly how much we need to decode after the first value
> parsed.

Agreed. And it's not the only culprit in NFS and RPC of this kind
of temporary "just in case" overallocation.

> Or at least cache it if it is that frequent (which it
> really shouldn't be due to the device id cache, or am I missing
> something?)

It's not a frequent operation; it's done the first time pNFS
encounters a new block device. But the alloc_page() loop is slow and
takes and releases an IRQ spinlock repeatedly (IIRC) so it's an
opportunity for IRQs to run and delay get_device_info considerably.

-- 
Chuck Lever