Re: [PATCH] NFSD: trim reads past NFS_OFFSET_MAX

Chuck Lever III <chuck.lever@xxxxxxxxxx> · Sun, 23 Jan 2022 17:35:17 +0000

> On Jan 22, 2022, at 3:30 PM, Trond Myklebust <trondmy@xxxxxxxxxxxxxxx> wrote:
> 
> On Sat, 2022-01-22 at 20:15 +0000, Chuck Lever III wrote:
>> 
>>> On Jan 22, 2022, at 1:27 PM, Trond Myklebust
>>> <trondmy@xxxxxxxxxxxxxxx> wrote:
>>> 
>>> On Sat, 2022-01-22 at 17:05 +0000, Chuck Lever III wrote:
>>>> 
>>>>> On Jan 22, 2022, at 7:47 AM, Dan Aloni <dan.aloni@xxxxxxxxxxxx>
>>>>> wrote:
>>>>> 
>>>>> On Fri, Jan 21, 2022 at 10:32:28PM +0000, Chuck Lever III
>>>>> wrote:
>>>>>>> On Jan 21, 2022, at 1:50 PM, Dan Aloni
>>>>>>> <dan.aloni@xxxxxxxxxxxx>
>>>>>>> wrote:
>>>>>>> 
>>>>>>> Due to change 8cfb9015280d ("NFS: Always provide aligned
>>>>>>> buffers to the
>>>>>>> RPC read layers"), a read of 0xfff is aligned up to server
>>>>>>> rsize of
>>>>>>> 0x1000.
>>>>>>> 
>>>>>>> As a result, in a test where the server has a file of size
>>>>>>> 0x7fffffffffffffff, and the client tries to read from the
>>>>>>> offset
>>>>>>> 0x7ffffffffffff000, the read causes loff_t overflow in the
>>>>>>> server and it
>>>>>>> returns an NFS code of EINVAL to the client. The client as
>>>>>>> a
>>>>>>> result
>>>>>>> indefinitely retries the request.
>>>>>> 
>>>>>> An infinite loop in this case is a client bug.
>>>>>> 
>>>>>> Section 3.3.6 of RFC 1813 permits the NFSv3 READ procedure
>>>>>> to return NFS3ERR_INVAL. The READ entry in Table 6 of RFC
>>>>>> 5661 permits the NFSv4 READ operation to return
>>>>>> NFS4ERR_INVAL.
>>>>>> 
>>>>>> Was the client side fix for this issue rejected?
>>>>> 
>>>>> Yeah, see Trond's response in
>>>>> 
>>>>>  
>>>>> https://lore.kernel.org/linux-nfs/fa9974724216c43f9bdb3fd39555d398fde11e59.camel@xxxxxxxxxxxxxxx/
>>>>> 
>>>>> So it is both a client and server bugs?
>>>> 
>>>> Splitting hairs, but yes there are issues on both sides
>>>> IMO. Bad behavior due to bugs on both sides is actually
>>>> not uncommon.
>>>> 
>>>> Trond is correct that the server is not dealing totally
>>>> correctly with the range of values in a READ request.
>>>> 
>>>> However, as I pointed out, the specification permits NFS
>>>> servers to return NFS[34]ERR_INVAL on READ. And in fact,
>>>> there is already code in the NFSv4 READ path that returns
>>>> INVAL, for example:
>>>> 
>>>>  785         if (read->rd_offset >= OFFSET_MAX)
>>>>  786                 return nfserr_inval;
>>>> 
>>>> I'm not sure the specifications describe precisely when
>>>> the server /must/ return INVAL, but the client needs to
>>>> be prepared to handle it reasonably. If INVAL results in
>>>> an infinite loop, then that's a client bug.
>>>> 
>>>> IMO changing the alignment for that case is a band-aid.
>>>> The underlying looping behavior is what is the root
>>>> problem. (So... I agree with Trond's NACK, but for
>>>> different reasons).
>>> 
>>> If I'm reading Dan's test case correctly, the client is trying to
>>> read
>>> a full page of 0x1000 bytes starting at offset 0x7fffffffffffff000.
>>> That means the end offset for that read is 0x7fffffffffffff000 +
>>> 0x1000
>>> - 1 = 0x7fffffffffffffff.
>>> 
>>> IOW: as far as the server is concerned, there is no loff_t overflow
>>> on
>>> either the start or end offset and so there is no reason for it to
>>> return NFS4ERR_INVAL.
>> 
>> Yep, I agree there's server misbehavior, and I think Dan's
>> server fix is on point.
>> 
>> I would like to know why the client is looping, though. INVAL
>> is a valid response the Linux server already uses in other
>> cases and by itself should not trigger a READ retry.
>> 
>> After checking the relevant XDR definitions, an NFS READ error
>> response doesn't include the EOF flag, so I'm a little mystified
>> why the client would need to retry after receiving INVAL.
> 
> While we could certainly add that error to nfs_error_is_fatal(), the
> question is why the client should need to handle NFS4ERR_INVAL if it is
> sending valid arguments?

As I said:

I agree that Dan's test case is sending values in a range
that NFSD should handle without error. That does need to
be fixed.

However, there are other instances where NFSD returns INVAL
to a READ (and it has done so for a long while). Those cases
really mustn't trigger an unterminating loop, especially
since a ^C is not likely to unblock the application. That's
why I'm still concerned about behavior when a server returns
INVAL on a READ.

> 15.1.1.4.  NFS4ERR_INVAL (Error Code 22)
> 
>   The arguments for this operation are not valid for some reason, even
>   though they do match those specified in the XDR definition for the
>   request.
> 
> 
> Sure... What does that mean, and what do I do?

Let me try to paraphrase:

A. RFC 1813 and 8881 permit servers to return INVAL on READ,
   but do not specify under which conditions to use it. This
   ambiguity might be reason for a server implementation to
   avoid that status code with READ. Have you considered
   filing an errata?

B. Though the RFCs permit servers to return INVAL on READ,
   the Linux NFS client does not support it. The client is
   not spec-compliant in this regard, but that's because of
   the ambiguity described in A.

C. Therefore the Linux NFS client treats INVAL on READ as
   unexpected input.

I claim that when confronted with unexpected input (of any
form) a "good quality" client implementation should avoid
pathological behavior like unterminating loops.... That
behavior is both an attack surface and potentially a
problem if the client has to be rebooted to fully recover.

The specific behavior of returning INVAL on READ is being
addressed in the Linux server, but not root-causing and
addressing the client's response to this behavior leaves a
large set of potential issues in this same class.

--
Chuck Lever