Document the implementation of error handlers in sysfs.

Changelog:

V2:
        - Add a description of the precedence order of each option, focusing on
          the behavior of "fail_at_unmount", which was not well explained in V1

Signed-off-by: Carlos Maiolino <cmaiolino@xxxxxxxxxx>
---
 Documentation/filesystems/xfs.txt | 94 +++++++++++++++++++++++++++++++++++++++
 1 file changed, 94 insertions(+)

diff --git a/Documentation/filesystems/xfs.txt b/Documentation/filesystems/xfs.txt
index 8146e9f..d483e0b 100644
--- a/Documentation/filesystems/xfs.txt
+++ b/Documentation/filesystems/xfs.txt
@@ -348,3 +348,97 @@ Removed Sysctls
   ----                          -------
   fs.xfs.xfsbufd_centisec       v4.0
   fs.xfs.age_buffer_centisecs   v4.0
+
+Error handling
+==============
+
+XFS can act differently according to the type of error found
+during its operation. The implementation introduces the following
+concepts to the error handler:
+
+ -failure speed:
+        Defines how fast XFS should shut down when a specific
+        error is found during filesystem operation. It can
+        shut down immediately, after a defined number of tries, or
+        simply retry forever, which was the old behavior and is now
+        the default behavior, except at unmount time, where
+        the filesystem will shut down if an error is found
+        while unmounting.
+
+ -error classes:
+        Specifies the subsystem/location that the error handlers
+        configure the behavior for, such as metadata or memory allocation.
+
+ -error handlers:
+        Defines the behavior for a specific error.
+
+The filesystem behavior during an error can be set via sysfs files, where the
+errors are organized with the following structure:
+
+  /sys/fs/xfs/<dev>/error/<class>/<error>/
+
+Each directory contains:
+
+ /sys/fs/xfs/<dev>/error/
+
+  fail_at_unmount               (Min:  0  Default:  1  Max: 1)
+        Defines the global error behavior during unmount time.
+        If set to "1", XFS will shut down if any error is found; otherwise,
+        if set to "0", the filesystem will retry indefinitely to cleanly
+        unmount the filesystem.
+
+  <class> subdirectories
+        Contain the configuration of specific error handlers
+        (Ex: /sys/fs/xfs/<dev>/error/metadata).
+
+ /sys/fs/xfs/<dev>/error/<class>/
+
+  The contents of this directory are <class> specific, since each <class>
+  might need to handle different types of errors. Every <class> directory,
+  though, contains a "default" directory, which holds the global configuration
+  for errors that cannot be configured independently.
+
+ /sys/fs/xfs/<dev>/error/<class>/<error>
+
+  Contains the failure speed configuration files for each specific error,
+  including the "default" behavior, which contains the same configuration
+  options as the specific errors.
+
+  The available configurations for each error type are:
+
+  max_retries                   (Min: -1  Default: -1  Max: INTMAX)
+        Defines how many times the filesystem is allowed to retry its
+        operations on the specific error before shutting down the
+        filesystem. Setting this file to "-1" makes XFS retry
+        forever on the specific error, setting it to "0" makes
+        XFS fail immediately after the specific error is found,
+        and setting it to a value "N", where N is greater than 0,
+        makes XFS retry "N" times before shutting down.
+
+  retry_timeout_seconds         (Min:  0  Default:  0  Max: INTMAX)
+        Defines the amount of time (in seconds) that the filesystem is
+        allowed to retry its operations when the specific error is
+        found. "0" means no wait time.
+
+
+
+  Order of precedence:
+        "max_retries" takes precedence over "retry_timeout_seconds":
+        "retry_timeout_seconds" is only tested if the
+        "max_retries" limit has not been reached yet or is set to retry
+        forever ("-1"). If the "max_retries" limit is reached, the
+        filesystem will shut down, whether or not "retry_timeout_seconds"
+        has been reached.
+
+        "fail_at_unmount", on the other hand, works independently of the
+        remaining options. It is only tested during unmount time,
+        but it will shut down the filesystem regardless of the limits
+        set in "max_retries" or "retry_timeout_seconds".
+        It has been added because sysfs configuration can't be changed
+        after an unmount is triggered, since the sysfs directory of
+        the filesystem being unmounted will be detached from the sysfs
+        tree. So, even if the sysadmin wants to make XFS retry forever
+        on any error during filesystem operation, the filesystem
+        can still be properly unmounted if any error is detected and
+        "fail_at_unmount" is set. Otherwise, the umount process gets
+        stuck forever.
-- 
2.5.5
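
For anyone reviewing the precedence rules, the decision described in the
patch can be modelled roughly as below. This is only an illustrative Python
sketch and not the kernel code: the function name and the parameters
retries_done, seconds_retrying and unmounting are hypothetical. It also
assumes that a retry_timeout_seconds of "0" imposes no time limit, which the
text does not state outright but which seems implied by "retry forever" being
the default behavior.

```python
RETRY_FOREVER = -1  # documented meaning of max_retries == -1


def should_shut_down(retries_done, seconds_retrying,
                     max_retries=-1, retry_timeout_seconds=0,
                     unmounting=False, fail_at_unmount=1):
    """Return (shut_down, reason) for one occurrence of an error.

    retries_done and seconds_retrying are hypothetical bookkeeping
    values: retries already performed, and time since the first retry.
    """
    # max_retries is tested first. 0 fails on the first error,
    # N allows N retries, -1 never trips this check.
    if max_retries != RETRY_FOREVER and retries_done >= max_retries:
        return True, "max_retries"

    # retry_timeout_seconds is only reached if max_retries did not trip.
    # Assumption: 0 means "no time limit".
    if retry_timeout_seconds > 0 and seconds_retrying >= retry_timeout_seconds:
        return True, "retry_timeout_seconds"

    # fail_at_unmount is independent of both limits and only matters
    # while the filesystem is being unmounted.
    if unmounting and fail_at_unmount == 1:
        return True, "fail_at_unmount"

    return False, "retry"
```

With the documented defaults the model retries forever during normal
operation, but shuts down on the first error seen while unmounting, which is
the case "fail_at_unmount" was added for.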