On Wed, Sep 14, 2016 at 12:02:19PM +0200, Carlos Maiolino wrote:
> On Wed, Sep 14, 2016 at 11:23:34AM +1000, Dave Chinner wrote:
> > Ok, I had to update this for the change in retry timeout values from
> > Eric, so I went and fixed all the other things I thought needed
> > fixing, too. New patch below....
> >
>
> Hi, thanks, this looks good to me, with one exception described below.
>
> > Dave.
> > --
> > Dave Chinner
> > david@xxxxxxxxxxxxx
> >
> > xfs: Document error handlers behavior
> >
> > From: Carlos Maiolino <cmaiolino@xxxxxxxxxx>
> >
> > +  -error handlers:
> > +     Defines the behavior for a specific error.
> > +
> > +The filesystem behavior during an error can be set via sysfs files, Each
> > +error handler works independently, the first condition met by and error handler
> > +for a specific class will cause the error to be propagated rather than reset and
> > +retried.
> > +
> > +The action taken by the filesystem when the error is propagated is context
> > +dependent - it may cause a shut down in the case of an unrecoverable error,
> > +it may be reported back to userspace, or it may even be ignored because
> > +there's nothing useful we can with the error or anyone we can report it to (e.g.
>
> "there's nothing useful we can do with the error"
>
> > +during unmount).
>
> Also, I apologize if I misunderstand it, but being ignored doesn't look a proper
> description here, it sounds to me something like 'we ignore the error and tell
> nobody about it", in unmount example, we shut down the filesystem if any error
> happens, for me it doesn't sound like ignoring an error, but I might be
> interpreting it in the wrong way.

I think you're making the assumption that the only way we handle
errors once retries are exhausted is to trigger a filesystem
shutdown. That assumption was repeated throughout the documentation.
While that may be true for /metadata write IO errors/, it is not
true for the generic error handling case. e.g.
if we extend it to memory allocation contexts, we may end up
returning ENOMEM to userspace. Or, in certain contexts, we might be
able to fall back to doing a single operation at a time using the
stack for storage, in which case there is no reason at all to report
the allocation failure to anyone.

The infrastructure is generic, as is the documentation, and so it
shouldn't assume anything about what is going to happen once the
retries are exhausted and the error is propagated upwards. What
happens with that error after it is returned is a subsystem and
context dependent behaviour, not something that is defined by the
error retry configuration infrastructure....

Cheers,

Dave.
--
Dave Chinner
david@xxxxxxxxxxxxx
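
For illustration only, the split described above (a generic retry
configuration that decides *when* an error is propagated, and a
caller that decides *what happens* once it is) could be sketched
roughly as follows. This is not XFS code: every name here
(struct retry_cfg, should_propagate, handle_failure, the enum values)
is hypothetical, and the only assumed semantics are that a negative
max_retries means "retry forever" and that N means "retry N times,
then propagate".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a per-error retry configuration. */
struct retry_cfg {
	int max_retries;	/* < 0: retry forever; N: retry N times */
};

/* Hypothetical contexts in which an error can be seen. */
enum err_ctx {
	CTX_METADATA_WRITE,		/* metadata write IO error */
	CTX_ALLOC_FOR_USER,		/* allocation on behalf of a syscall */
	CTX_ALLOC_WITH_FALLBACK,	/* allocation with an on-stack fallback */
};

/* What the caller ends up doing with the failure. */
enum err_outcome {
	OUT_RETRY,
	OUT_SHUTDOWN,
	OUT_RETURN_TO_USER,
	OUT_IGNORE,
};

/*
 * Generic layer: knows nothing about the context. It only answers
 * "are the retries exhausted?" for the given number of failures so far.
 */
static bool should_propagate(const struct retry_cfg *cfg, int failures)
{
	assert(cfg != NULL);
	if (cfg->max_retries < 0)
		return false;
	return failures > cfg->max_retries;
}

/*
 * Context layer: once the generic layer says "propagate", the action
 * taken depends entirely on who is handling the error.
 */
static enum err_outcome handle_failure(enum err_ctx ctx,
				       const struct retry_cfg *cfg,
				       int failures)
{
	if (!should_propagate(cfg, failures))
		return OUT_RETRY;

	switch (ctx) {
	case CTX_METADATA_WRITE:
		return OUT_SHUTDOWN;
	case CTX_ALLOC_FOR_USER:
		return OUT_RETURN_TO_USER;	/* e.g. ENOMEM to userspace */
	case CTX_ALLOC_WITH_FALLBACK:
		return OUT_IGNORE;		/* fallback works, nobody to tell */
	}
	return OUT_SHUTDOWN;
}
```

The same exhausted-retries condition yields three different outcomes
here, which is why the documentation for the configuration layer
should not promise any one of them.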