> On Aug 28, 2019, at 10:16 AM, Jeff Layton <jlayton@xxxxxxxxxx> wrote:
>
> On Wed, 2019-08-28 at 10:03 -0400, Chuck Lever wrote:
>>> On Aug 28, 2019, at 10:00 AM, J. Bruce Fields <bfields@xxxxxxxxxx> wrote:
>>>
>>> On Wed, Aug 28, 2019 at 09:57:25AM -0400, Chuck Lever wrote:
>>>>
>>>>> On Aug 28, 2019, at 9:51 AM, Jeff Layton <jlayton@xxxxxxxxxx> wrote:
>>>>>
>>>>> On Wed, 2019-08-28 at 09:48 -0400, bfields@xxxxxxxxxxxx wrote:
>>>>>> On Tue, Aug 27, 2019 at 03:15:35PM +0000, Trond Myklebust wrote:
>>>>>>> I'm open to other suggestions, but I'm having trouble finding one that
>>>>>>> can scale correctly (i.e. not require per-client tracking), prevent
>>>>>>> silent corruption (by causing clients to miss errors), while not
>>>>>>> relying on optional features that may not be implemented by all NFSv3
>>>>>>> clients (e.g. per-file write verifiers are not implemented by *BSD).
>>>>>>>
>>>>>>> That said, it seems to me that to do nothing should not be an option,
>>>>>>> as that would imply tolerating silent corruption of file data.
>>>>>>
>>>>>> So should we increment the boot verifier every time we discover an error
>>>>>> on an asynchronous write?
>>>>>>
>>>>>
>>>>> I think so. Otherwise, only one client will ever see that error.
>>>>
>>>> +1
>>>>
>>>> I'm not familiar with the details of how the Linux NFS server
>>>> implements the boot verifier: Will a verifier bump be effective
>>>> for all file systems that server exports?
>>>
>>> Yes. It will be per network namespace, but that's the only limit.
>>>
>>>> If so, is that an acceptable cost?
>>>
>>> It means clients will resend all their uncommitted writes. That could
>>> certainly make write errors more expensive. But maybe you've already
>>> got bigger problems if you've got a full or failing disk?
>>
>> One full filesystem will impact the behavior of all other exported
>> filesystems. That might be surprising behavior to a server administrator.
>> I don't have any suggestions other than maintaining a separate verifier
>> for each exported filesystem in each net namespace.
>>
>>
>
> Yeah, it's not pretty, but I think the alternative is worse. Most admins
> would take rotten performance over data corruption.

Again, I'm not saying we should do nothing. It doesn't seem
like a per-export verifier would be much more work than a
per-net-namespace verifier.

> For the most part, these sorts of errors tend to be rare.

EIO is certainly going to be rare, agreed. ENOSPC might not be.

> If it turns
> out to be a problem we could consider moving the verifier into
> svc_export or something?

--
Chuck Lever