Hi Dan- Dropping stable@. No need to copy them on this discussion. Also, you don't need to actually cc: stable when you repost. Just add the tag as you did below. Automation will pick up the patch when it goes into mainline. More below. > On Jan 23, 2022, at 12:03 PM, Chuck Lever III <chuck.lever@xxxxxxxxxx> wrote: > >> >> On Jan 23, 2022, at 10:29 AM, Trond Myklebust <trondmy@xxxxxxxxxxxxxxx> wrote: >> >> On Sun, 2022-01-23 at 11:50 +0200, Dan Aloni wrote: >>> Due to commit 8cfb9015280d ("NFS: Always provide aligned buffers to >>> the >>> RPC read layers") on the client, a read of 0xfff is aligned up to >>> server >>> rsize of 0x1000. >>> >>> As a result, in a test where the server has a file of size >>> 0x7fffffffffffffff, and the client tries to read from the offset >>> 0x7ffffffffffff000, the read causes loff_t overflow in the server and >>> it >>> returns an NFS code of EINVAL to the client. The client as a result >>> indefinitely retries the request. >>> >>> This fixes the issue at server side by trimming reads past >>> NFS_OFFSET_MAX. It also adds a missing check for out of bound offset >>> in >>> NFSv3, copying a similar check from NFSv4.x. >>> >>> Cc: <stable@xxxxxxxxxxxxxxx> >>> Signed-off-by: Dan Aloni <dan.aloni@xxxxxxxxxxxx> >>> --- >>> fs/nfsd/nfs4proc.c | 3 +++ >>> fs/nfsd/vfs.c | 6 ++++++ >>> 2 files changed, 9 insertions(+) >>> >>> diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c >>> index 486c5dba4b65..816bdf212559 100644 >>> --- a/fs/nfsd/nfs4proc.c >>> +++ b/fs/nfsd/nfs4proc.c >>> @@ -785,6 +785,9 @@ nfsd4_read(struct svc_rqst *rqstp, struct >>> nfsd4_compound_state *cstate, >>> if (read->rd_offset >= OFFSET_MAX) >>> return nfserr_inval; >>> >>> + if (unlikely(read->rd_offset + read->rd_length > OFFSET_MAX)) >>> + read->rd_length = OFFSET_MAX - read->rd_offset; >>> + >>> trace_nfsd_read_start(rqstp, &cstate->current_fh, >>> read->rd_offset, read->rd_length); >>> >>> diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c >>> index 738d564ca4ce..ad4df374433e 100644 >>> --- a/fs/nfsd/vfs.c >>> +++ b/fs/nfsd/vfs.c >>> @@ -1045,6 +1045,12 @@ __be32 nfsd_read(struct svc_rqst *rqstp, >>> struct svc_fh *fhp, >>> struct file *file; >>> __be32 err; >>> >>> + if (unlikely(offset >= NFS_OFFSET_MAX)) >>> + return nfserr_inval; >> >> POSIX does allow you to lseek to the maximum filesize offset (sb- >>> s_maxbytes), however any subsequent read will return 0 bytes (i.e. >> eof), whereas a subsequent write will return EFBIG. > > I'm simply trying to clarify your request. > > You've stated that the Linux NFS client does not handle INVAL > responses, even though both RFC 1813 and 8881 permit it. So > are you suggesting (here) that the Linux NFS server should > not return INVAL on READs beyond the filesystem's supported > maximum file size but instead return a successful 0-byte > result with EOF set? After some thought and discussion with Solaris NFS server engineers, I think this is the best response to a READ whose range arguments go past the server's advertised maxfilesize. So can you please confirm that your final fix does: 1. The range of values that was failing before but shouldn't have, now succeeds 2. When the offset is less than maxfilesize but offset+count exceeds it, the READ should succeed but return a short result and set EOF 3. When the range is completely outside maxfilesize, the READ should succeed but return zero bytes and set EOF And I don't mind if you split the fix over two or three patches if that makes it more clear. >>> + >>> + if (unlikely(offset + *count > NFS_OFFSET_MAX)) >>> + *count = NFS_OFFSET_MAX - offset; >> >> Can we please fix this to use the actual per-filesystem file size >> limit, which is set as sb->s_maxbytes, instead of using NFS_OFFSET_MAX? >> >> Ditto for 'read' above. > > That sounds reasonable. I'm wondering whether the VFS itself will bound the range arguments relative to sb->s_maxbytes. I'm told that the kiocb iterators used in fs/nfsd/vfs.c should do that. All that NFSD has to do is ensure that the incoming client values are converted from u64 to loff_t without underflowing. So comparing @offset with OFFSET_MAX here seems like the right thing to do? We just don't want the READ to fail with INVAL. > But I wonder if there are some other > places that need the same treatment. I've confirmed that there /are/ other places that need to be fixed. I've created a series of patches that will address those. The first of those was the COMMIT patch I posted yesterday. Dan, I'd like to include your READ fixes in that series. >>> + >>> trace_nfsd_read_start(rqstp, fhp, offset, *count); >>> err = nfsd_file_acquire(rqstp, fhp, NFSD_MAY_READ, &nf); >>> if (err) > > -- > Chuck Lever -- Chuck Lever