Re: [PATCH 0/3] Handling NFSv3 I/O errors in knfsd

Chuck Lever <chuck.lever@xxxxxxxxxx> · Wed, 28 Aug 2019 10:03:01 -0400

> On Aug 28, 2019, at 10:00 AM, J. Bruce Fields <bfields@xxxxxxxxxx> wrote:
> 
> On Wed, Aug 28, 2019 at 09:57:25AM -0400, Chuck Lever wrote:
>> 
>> 
>>> On Aug 28, 2019, at 9:51 AM, Jeff Layton <jlayton@xxxxxxxxxx> wrote:
>>> 
>>> On Wed, 2019-08-28 at 09:48 -0400, bfields@xxxxxxxxxxxx wrote:
>>>> On Tue, Aug 27, 2019 at 03:15:35PM +0000, Trond Myklebust wrote:
>>>>> I'm open to other suggestions, but I'm having trouble finding one that
>>>>> can scale correctly (i.e. not require per-client tracking), prevent
>>>>> silent corruption (by causing clients to miss errors), while not
>>>>> relying on optional features that may not be implemented by all NFSv3
>>>>> clients (e.g. per-file write verifiers are not implemented by *BSD).
>>>>> 
>>>>> That said, it seems to me that to do nothing should not be an option,
>>>>> as that would imply tolerating silent corruption of file data.
>>>> 
>>>> So should we increment the boot verifier every time we discover an error
>>>> on an asynchronous write?
>>>> 
>>> 
>>> I think so. Otherwise, only one client will ever see that error.
>> 
>> +1
>> 
>> I'm not familiar with the details of how the Linux NFS server
>> implements the boot verifier: Will a verifier bump be effective
>> for all file systems that server exports?
> 
> Yes.  It will be per network namespace, but that's the only limit.
> 
>> If so, is that an acceptable cost?
> 
> It means clients will resend all their uncommitted writes.  That could
> certainly make write errors more expensive.  But maybe you've already
> got bigger problems if you've got a full or failing disk?

With a single verifier, one full filesystem will impact the behavior of
all other exported filesystems in that net namespace: clients will resend
uncommitted writes to exports that saw no error. That might be surprising
behavior to a server administrator. I don't have any suggestions other
than maintaining a separate verifier for each exported filesystem in each
net namespace.
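
To make the difference in scope concrete, here is a toy user-space
model of the two options. It is only a sketch, not knfsd code: ToyServer,
async_write_error() and exports_needing_resend() are made-up names, and a
simple counter stands in for however the server actually generates its
8-byte write verifier.

```python
import itertools

# Hypothetical stand-in for verifier generation: each call yields a new,
# distinct 8-byte value (NFSv3 write verifiers are 8 opaque bytes).
_counter = itertools.count(1)


def new_verifier() -> bytes:
    return next(_counter).to_bytes(8, "big")


class ToyServer:
    """Toy model: one verifier per namespace, or one per export."""

    def __init__(self, exports, per_export):
        self.per_export = per_export
        self._global = new_verifier()
        self._by_export = {name: new_verifier() for name in exports}

    def verifier(self, export):
        # What a WRITE or COMMIT reply for this export would carry.
        return self._by_export[export] if self.per_export else self._global

    def async_write_error(self, export):
        # Bump the verifier so clients notice and resend unstable writes.
        if self.per_export:
            self._by_export[export] = new_verifier()
        else:
            self._global = new_verifier()


def exports_needing_resend(server, cached):
    """Exports whose verifier no longer matches what the client cached."""
    return sorted(e for e, v in cached.items() if server.verifier(e) != v)


for per_export in (False, True):
    srv = ToyServer(["/a", "/b", "/c"], per_export)
    cached = {e: srv.verifier(e) for e in ("/a", "/b", "/c")}
    srv.async_write_error("/a")  # e.g. /a hits ENOSPC on writeback
    print("per-export" if per_export else "global",
          exports_needing_resend(srv, cached))
```

In this model an error on /a forces resends on /a, /b and /c with the
single verifier, and on /a alone with per-export verifiers. It says
nothing about the cost of tracking the extra verifiers in the server.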

--
Chuck Lever