Re: [PATCH v3 6/6] NFSD: Repeal and replace the READ_PLUS implementation

Matthew Wilcox <willy@xxxxxxxxxxxxx> · Tue, 19 Jul 2022 22:10:49 +0100

On Tue, Jul 19, 2022 at 04:24:18PM -0400, Anna Schumaker wrote:
> On Tue, Jul 19, 2022 at 1:21 PM Chuck Lever III <chuck.lever@xxxxxxxxxx> wrote:
> > But I also thought the purpose of READ_PLUS was to help clients
> > preserve unallocated extents in files during copy operations.
> > An unallocated extent is not the same as an allocated extent
> > that has zeroes written into it. IIUC this new logic does not
> > distinguish between those two cases at all. (And please correct
> > me if this is really not the goal of READ_PLUS).
> 
> I wasn't aware of this as a goal of READ_PLUS. As of right now, Linux
> doesn't really have a way to punch holes into pagecache data, so we
> end up needing to zero-fill on the client side during decoding.

I've proven myself unqualified to opine on how NFS should be doing
things in the past ... so let me see if I understand how NFS works
for this today.

Userspace issues a read(), the VFS allocates some pages to cache the
data and calls ->readahead() to get the filesystem to fill those pages.
NFS uses READ_PLUS to get the data and the server says "this is a hole,
no data for you", at which point NFS has to call memset() because the
page cache does not have the ability to represent holes?
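
To make that flow concrete, here is a minimal userspace sketch of the
"server says hole, client calls memset()" step.  It is not the NFS client
code; struct segment, enum seg_kind and fill_from_segments() are
hypothetical names invented for illustration, loosely modelled on a reply
that is a list of data and hole segments.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical segment kinds, loosely modelled on a READ_PLUS reply. */
enum seg_kind { SEG_DATA, SEG_HOLE };

struct segment {
	enum seg_kind kind;
	size_t len;                 /* bytes covered by this segment */
	const unsigned char *data;  /* payload for SEG_DATA, NULL for SEG_HOLE */
};

/*
 * Fill buf (capacity cap) from the segment list, in order.
 * Data segments are copied; hole segments are zero-filled, because the
 * destination buffer has no way to say "nothing here".
 * Returns the number of bytes filled.
 */
size_t fill_from_segments(unsigned char *buf, size_t cap,
			  const struct segment *segs, size_t nsegs)
{
	size_t off = 0;

	for (size_t i = 0; i < nsegs && off < cap; i++) {
		size_t n = segs[i].len;

		if (n > cap - off)
			n = cap - off;  /* clamp to the space that is left */

		if (segs[i].kind == SEG_DATA)
			memcpy(buf + off, segs[i].data, n);
		else
			memset(buf + off, 0, n);  /* hole: no data on the wire */

		off += n;
	}
	return off;
}
```

The point of the sketch is only that the hole never survives into the
buffer: after decoding, a hole and a run of written zeroes look identical.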

If so, that pretty much matches how block filesystems work.  Except that
block filesystems know the "layout" of the file; whether they use iomap
or buffer_heads, they can know this without doing I/O (some filesystems
like ext2 delay knowing the layout of the file until an I/O happens,
but then they cache it).
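
As a rough userspace analogue of "the filesystem knows the layout", the
sketch below asks the kernel where the next data extent is, using
lseek() with SEEK_DATA and SEEK_HOLE.  The struct extent type and
next_data_extent() are hypothetical helpers written for this mail; the
fallback #defines assume the Linux values for the two whence constants.
Filesystems without real hole tracking may report the whole file as one
data extent.

```c
#define _GNU_SOURCE
#include <sys/types.h>
#include <unistd.h>
#include <errno.h>

/* Assumption: Linux values, used only if the libc headers hide them. */
#ifndef SEEK_DATA
#define SEEK_DATA 3
#define SEEK_HOLE 4
#endif

/* Hypothetical helper type: one data extent, [start, end). */
struct extent {
	off_t start;
	off_t end;
};

/*
 * Find the next data extent at or after pos.
 * Returns 1 and fills *out if one exists, 0 if there is no more data
 * (lseek() fails with ENXIO), and -1 on any other error.
 */
int next_data_extent(int fd, off_t pos, struct extent *out)
{
	off_t data = lseek(fd, pos, SEEK_DATA);

	if (data < 0)
		return errno == ENXIO ? 0 : -1;

	/* There is always an implicit hole at end of file. */
	off_t hole = lseek(fd, data, SEEK_HOLE);

	if (hole < 0)
		return -1;

	out->start = data;
	out->end = hole;
	return 1;
}
```

Calling this in a loop, with pos set to the previous extent's end, walks
the data extents of a file; everything between them is a hole.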

So I think Linux is currently built on assuming the filesystem knows
where its holes are, rather than informing the page cache about its holes.
Could we do better?  Probably!  I'd be interested in seeing what happens
if we add support for "this page is in a hole" to the page cache.
I'd also be interested in seeing how things would change if we had
filesystems provide their extent information to the VFS and have the VFS
handle holes all by itself without troubling the filesystem.  It'd require
network filesystems to invalidate the VFS's knowledge of extents if
another client modifies the file.  I haven't thought about it deeply.
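
Purely to illustrate the last idea, here is a toy userspace model of an
extent cache that can answer "data", "hole" or "don't know", and that
gets thrown away when another client changes the file.  Nothing like
this exists in the VFS as far as this mail is concerned; every name
below (toy_extent_cache, cache_add(), cache_lookup(),
cache_invalidate()) is made up, and the fixed-size array is just to
keep the sketch short.

```c
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_EXTENTS 8

enum range_state { RANGE_UNKNOWN, RANGE_DATA, RANGE_HOLE };

/* One cached extent covering [start, end). */
struct toy_extent {
	unsigned long start;
	unsigned long end;
	bool hole;
};

struct toy_extent_cache {
	struct toy_extent ext[TOY_MAX_EXTENTS];
	size_t n;
	bool valid;  /* false until the filesystem has reported something */
};

void cache_init(struct toy_extent_cache *c)
{
	c->n = 0;
	c->valid = false;
}

/* Record an extent reported by the filesystem; false if the cache is full. */
bool cache_add(struct toy_extent_cache *c, unsigned long start,
	       unsigned long end, bool hole)
{
	if (c->n == TOY_MAX_EXTENTS)
		return false;
	c->ext[c->n++] = (struct toy_extent){ start, end, hole };
	c->valid = true;
	return true;
}

/* Drop everything, e.g. when another client is known to have modified the file. */
void cache_invalidate(struct toy_extent_cache *c)
{
	c->n = 0;
	c->valid = false;
}

/* Answer from the cache alone; RANGE_UNKNOWN means "ask the filesystem". */
enum range_state cache_lookup(const struct toy_extent_cache *c,
			      unsigned long pos)
{
	if (!c->valid)
		return RANGE_UNKNOWN;
	for (size_t i = 0; i < c->n; i++) {
		if (pos >= c->ext[i].start && pos < c->ext[i].end)
			return c->ext[i].hole ? RANGE_HOLE : RANGE_DATA;
	}
	return RANGE_UNKNOWN;
}
```

The interesting case is RANGE_UNKNOWN after an invalidation: that is the
point where a network filesystem would have to go back to the server.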