> On Jan 22, 2022, at 1:27 PM, Trond Myklebust <trondmy@xxxxxxxxxxxxxxx> wrote: > > On Sat, 2022-01-22 at 17:05 +0000, Chuck Lever III wrote: >> >>> On Jan 22, 2022, at 7:47 AM, Dan Aloni <dan.aloni@xxxxxxxxxxxx> >>> wrote: >>> >>> On Fri, Jan 21, 2022 at 10:32:28PM +0000, Chuck Lever III wrote: >>>>> On Jan 21, 2022, at 1:50 PM, Dan Aloni <dan.aloni@xxxxxxxxxxxx> >>>>> wrote: >>>>> >>>>> Due to change 8cfb9015280d ("NFS: Always provide aligned >>>>> buffers to the >>>>> RPC read layers"), a read of 0xfff is aligned up to server >>>>> rsize of >>>>> 0x1000. >>>>> >>>>> As a result, in a test where the server has a file of size >>>>> 0x7fffffffffffffff, and the client tries to read from the >>>>> offset >>>>> 0x7ffffffffffff000, the read causes loff_t overflow in the >>>>> server and it >>>>> returns an NFS code of EINVAL to the client. The client as a >>>>> result >>>>> indefinitely retries the request. >>>> >>>> An infinite loop in this case is a client bug. >>>> >>>> Section 3.3.6 of RFC 1813 permits the NFSv3 READ procedure >>>> to return NFS3ERR_INVAL. The READ entry in Table 6 of RFC >>>> 5661 permits the NFSv4 READ operation to return >>>> NFS4ERR_INVAL. >>>> >>>> Was the client side fix for this issue rejected? >>> >>> Yeah, see Trond's response in >>> >>> >>> https://lore.kernel.org/linux-nfs/fa9974724216c43f9bdb3fd39555d398fde11e59.camel@xxxxxxxxxxxxxxx/ >>> >>> So it is both a client and server bugs? >> >> Splitting hairs, but yes there are issues on both sides >> IMO. Bad behavior due to bugs on both sides is actually >> not uncommon. >> >> Trond is correct that the server is not dealing totally >> correctly with the range of values in a READ request. >> >> However, as I pointed out, the specification permits NFS >> servers to return NFS[34]ERR_INVAL on READ. And in fact, >> there is already code in the NFSv4 READ path that returns >> INVAL, for example: >> >> 785 if (read->rd_offset >= OFFSET_MAX) >> 786 return nfserr_inval; >> >> I'm not sure the specifications describe precisely when >> the server /must/ return INVAL, but the client needs to >> be prepared to handle it reasonably. If INVAL results in >> an infinite loop, then that's a client bug. >> >> IMO changing the alignment for that case is a band-aid. >> The underlying looping behavior is what is the root >> problem. (So... I agree with Trond's NACK, but for >> different reasons). > > If I'm reading Dan's test case correctly, the client is trying to read > a full page of 0x1000 bytes starting at offset 0x7fffffffffffff000. > That means the end offset for that read is 0x7fffffffffffff000 + 0x1000 > - 1 = 0x7fffffffffffffff. > > IOW: as far as the server is concerned, there is no loff_t overflow on > either the start or end offset and so there is no reason for it to > return NFS4ERR_INVAL. Yep, I agree there's server misbehavior, and I think Dan's server fix is on point. I would like to know why the client is looping, though. INVAL is a valid response the Linux server already uses in other cases and by itself should not trigger a READ retry. After checking the relevant XDR definitions, an NFS READ error response doesn't include the EOF flag, so I'm a little mystified why the client would need to retry after receiving INVAL. -- Chuck Lever