On Thu, May 05, 2016 at 10:11:07AM -0400, Brian Foster wrote:
> On Wed, May 04, 2016 at 05:43:13PM +0200, Carlos Maiolino wrote:
> > This is the new revision of this patchset, according to the last comments.
> >
> > This patchset aims to implement configurable error behavior in XFS. Most
> > of the design was done by Dave, which is why I kept his Signed-off-by in
> > the patches.
> >
> > This new revision has a detailed changelog written on each patch, but the
> > major changes are:
> >
> > - Detailed per-patch changelog, and descriptions fixed to be
> >   (hopefully) clearer
> > - kept fail_at_unmount as a sysfs attribute
> >
> > Regarding fail_at_unmount, I left it almost exactly as in Dave's design,
> > given his comments on the last revision. I still think there is no need to
> > keep it at per-error granularity, so I was wondering whether a single,
> > global option in /sys/fs/xfs/<dev>/error/fail_at_unmount wouldn't suffice.
> > That would require a new place to store the value inside the kernel
> > instead of keeping it inside struct xfs_error_cfg. Or maybe use the same
> > structure, but outside of the m_error_cfg array?
> >
>
> I agree with regard to the granularity of fail_at_unmount. This was
> brought up previously:
>
> http://oss.sgi.com/archives/xfs/2016-02/msg00558.html
>
> ... and I haven't heard a use case for per-error granularity.

Hi, yes, my comment was based on our previous discussion. My apologies for
not making that clear.

>
> I suggest just to pull it out of the error classification stuff entirely
> and place it under xfs_mount. E.g., at the same level as "fail_writes"
> (but not a DEBUG mode only option).
>
> I'm also wondering whether we need more mechanism for the
> fail_at_unmount behavior. For example, instead of defining
> XFS_MOUNT_UNMOUNTING, could we just call a function that resets
> max_retries (of each class) to 0 in the unmount path? Then maybe call
> the mount tunable retry_on_unmount or something like that. Thoughts?
>

I'm not opposed to that, although having a flag like XFS_MOUNT_UNMOUNTING
might be useful in the future. Still, wouldn't this single flag be better
than walking through all classes/errors resetting max_retries? That sounds
as granular as having fail_at_unmount inside each error: even though it's
not exposed to user space, we would need to iterate over each max_retries
to actually shut down the filesystem during unmount, which is also
error-prone IMHO.

It also depends on how granular we make fail_at_unmount. If it's a single
global option, resetting all max_retries works. Otherwise it might not: for
example, if we decide to have fail_at_unmount per class, we might need to
reset max_retries only for specific errors, which will increase the
complexity of the code.

Well, I hope my comments make sense, just giving my $0.02 :)

cheers

> Brian
>
> > The first 6 patches are ready. The fail_at_unmount one needs to be
> > reworked if we want it in a less granular way, but so far I don't think
> > we have reached any decision about how it should be implemented.
> >
> >  fs/xfs/xfs_buf.h      |  22 ++++
> >  fs/xfs/xfs_buf_item.c | 126 ++++++++++++++--------
> >  fs/xfs/xfs_mount.c    |  19 +++-
> >  fs/xfs/xfs_mount.h    |  32 ++++++
> >  fs/xfs/xfs_sysfs.c    | 283 +++++++++++++++++++++++++++++++++++++++++++++++++-
> >  fs/xfs/xfs_sysfs.h    |   3 +
> >  6 files changed, 437 insertions(+), 48 deletions(-)
> >
> > --
> > 2.4.11
> >
> > _______________________________________________
> > xfs mailing list
> > xfs@xxxxxxxxxxx
> > http://oss.sgi.com/mailman/listinfo/xfs

--
Carlos
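
For illustration only, the two options discussed above could be sketched
roughly as below. This is not code from the patchset: demo_mount,
demo_error_cfg, the array sizes, RETRY_FOREVER and all function names are
hypothetical, simplified stand-ins for struct xfs_mount, struct xfs_error_cfg
and the m_error_cfg array, and the "unmounting" field only stands in for a
flag like XFS_MOUNT_UNMOUNTING.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; NOT the real XFS structures or constants. */
enum { DEMO_CLASS_MAX = 1, DEMO_ERRNO_MAX = 3 };
#define RETRY_FOREVER (-1)

struct demo_error_cfg {
	int max_retries;	/* RETRY_FOREVER: retry indefinitely */
};

struct demo_mount {
	struct demo_error_cfg error_cfg[DEMO_CLASS_MAX][DEMO_ERRNO_MAX];
	bool fail_at_unmount;	/* single, global tunable */
	bool unmounting;	/* stands in for XFS_MOUNT_UNMOUNTING */
};

/* Retry decision based only on the per-error configuration. */
static bool should_retry_plain(const struct demo_error_cfg *cfg, int retries)
{
	if (cfg->max_retries == RETRY_FOREVER)
		return true;
	return retries < cfg->max_retries;
}

/* Option A: one per-mount flag, checked where the retry decision is made. */
static bool should_retry_flag(const struct demo_mount *mp,
			      const struct demo_error_cfg *cfg, int retries)
{
	if (mp->unmounting && mp->fail_at_unmount)
		return false;
	return should_retry_plain(cfg, retries);
}

/* Option B: walk every class/error in the unmount path, zero max_retries. */
static void reset_retries_for_unmount(struct demo_mount *mp)
{
	assert(mp != NULL);
	for (size_t c = 0; c < DEMO_CLASS_MAX; c++)
		for (size_t e = 0; e < DEMO_ERRNO_MAX; e++)
			mp->error_cfg[c][e].max_retries = 0;
}
```

In this sketch, option A leaves the configured max_retries values untouched
and needs one check, while option B overwrites every entry, which is where
per-class handling would start to add special cases.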