> On Aug 27, 2019, at 11:15 AM, Trond Myklebust <trondmy@xxxxxxxxxxxxxxx> wrote:
>
> On Tue, 2019-08-27 at 10:59 -0400, bfields@xxxxxxxxxxxx wrote:
>> On Tue, Aug 27, 2019 at 10:58:19AM -0400, bfields@xxxxxxxxxxxx wrote:
>>> On Tue, Aug 27, 2019 at 02:53:01PM +0000, Trond Myklebust wrote:
>>>> The one problem is that the looping forever client can cause other
>>>> clients to loop forever on their otherwise successful writes on
>>>> other files.
>>>
>>> Yeah, that's the case I was wondering about.
>>>
>>>> That's bad, but again, that's due to client behaviour that is
>>>> toxic even today.
>>>
>>> So my worry was that if write errors are rare and the consequences
>>> of the single client looping forever are relatively mild, then
>>> there might be deployed clients that get away with that behavior.
>>>
>>> But maybe the behavior's a lot more "toxic" than I imagined, hence
>>> unlikely to be very common.
>>
>> (And, to be clear, I like the idea, just making sure I'm not
>> overlooking any problems....)
>>
> I'm open to other suggestions, but I'm having trouble finding one that
> can scale correctly (i.e. not require per-client tracking), prevent
> silent corruption (by causing clients to miss errors), while not
> relying on optional features that may not be implemented by all NFSv3
> clients (e.g. per-file write verifiers are not implemented by *BSD).
>
> That said, it seems to me that to do nothing should not be an option,
> as that would imply tolerating silent corruption of file data.

Agree, we should move forward. I'm not saying "do nothing," I'm just
trying to understand what is improved and what is still left to do
(maybe nothing).

--
Chuck Lever