Hi all,

This patchset builds off of Kent Overstreet's thread_with_file code to
deliver live information about filesystem health events to userspace.
This is done by creating a twf file and hooking internal operations so
that the event information can be queued to the twf without stalling the
kernel if the twf client program is nonresponsive. This is a private
ioctl, so events are expressed using simple json objects so that we can
enrich the output later on without having to rev a ton of C structs.

In userspace, we create a new daemon program that will read the json
event objects and initiate repairs automatically. This daemon is managed
entirely by systemd and will not block unmounting of the filesystem
unless repairs are ongoing. It is autostarted via some horrible udev
rules.

If you're going to start using this code, I strongly recommend pulling
from my git trees, which are linked below.

This has been running on the djcloud for months with no problems. Enjoy!
Comments and questions are, as always, welcome.

--D

kernel git tree:
https://git.kernel.org/cgit/linux/kernel/git/djwong/xfs-linux.git/log/?h=health-monitoring

xfsprogs git tree:
https://git.kernel.org/cgit/linux/kernel/git/djwong/xfsprogs-dev.git/log/?h=health-monitoring

fstests git tree:
https://git.kernel.org/cgit/linux/kernel/git/djwong/xfstests-dev.git/log/?h=health-monitoring
---
Commits in this patchset:
 * xfs: create debugfs uuid aliases
 * xfs: create hooks for monitoring health updates
 * xfs: create a filesystem shutdown hook
 * xfs: create hooks for media errors
 * iomap, filemap: report buffered read and write io errors to the filesystem
 * iomap: report directio read and write errors to callers
 * xfs: create file io error hooks
 * xfs: create a special file to pass filesystem health to userspace
 * xfs: create event queuing, formatting, and discovery infrastructure
 * xfs: report metadata health events through healthmon
 * xfs: report shutdown events through healthmon
 * xfs: report media errors through healthmon
 * xfs: report file io errors through healthmon
 * xfs: allow reconfiguration of the health monitoring device
 * xfs: add media error reporting ioctl
 * xfs: send uevents when mounting and unmounting a filesystem
---
 Documentation/filesystems/vfs.rst       |    7
 fs/iomap/buffered-io.c                  |   26 +
 fs/iomap/direct-io.c                    |    4
 fs/xfs/Kconfig                          |    8
 fs/xfs/Makefile                         |    7
 fs/xfs/libxfs/xfs_fs.h                  |   31 +
 fs/xfs/libxfs/xfs_health.h              |   47 +
 fs/xfs/libxfs/xfs_healthmon.schema.json |  595 +++++++++++++
 fs/xfs/xfs_aops.c                       |    2
 fs/xfs/xfs_file.c                       |  167 ++++
 fs/xfs/xfs_file.h                       |   36 +
 fs/xfs/xfs_fsops.c                      |   57 +
 fs/xfs/xfs_fsops.h                      |   14
 fs/xfs/xfs_health.c                     |  202 +++++
 fs/xfs/xfs_healthmon.c                  | 1372 +++++++++++++++++++++++++++++++
 fs/xfs/xfs_healthmon.h                  |  102 ++
 fs/xfs/xfs_ioctl.c                      |    7
 fs/xfs/xfs_linux.h                      |    3
 fs/xfs/xfs_mount.h                      |   13
 fs/xfs/xfs_notify_failure.c             |  137 +++
 fs/xfs/xfs_notify_failure.h             |   44 +
 fs/xfs/xfs_super.c                      |   55 +
 fs/xfs/xfs_trace.c                      |    4
 fs/xfs/xfs_trace.h                      |  369 ++++++++
 include/linux/fs.h                      |    4
 include/linux/iomap.h                   |    2
 26 files changed, 3301 insertions(+), 14 deletions(-)
 create mode 100644 fs/xfs/libxfs/xfs_healthmon.schema.json
 create mode 100644 fs/xfs/xfs_healthmon.c
 create mode 100644 fs/xfs/xfs_healthmon.h
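
As a rough illustration of the "queue events without stalling if the
client is nonresponsive" idea from the cover letter, here is a minimal
userspace C sketch of a bounded queue whose producer never waits: if the
reader has fallen behind and the queue is full, the event is dropped and
a counter records the loss. This is not the healthmon code from the
patchset. The names (struct evq, evq_push, evq_pop), the capacity, and
the drop-on-full policy are all assumptions made for the example, and
real code would also need locking between producer and consumer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define EVQ_CAPACITY 4    /* hypothetical limit, kept tiny so overflow is easy to see */
#define EVQ_MSG_MAX  128  /* hypothetical maximum length of one formatted event */

/* Fixed-size ring of formatted event strings (illustrative only). */
struct evq {
	char          msgs[EVQ_CAPACITY][EVQ_MSG_MAX];
	size_t        head;   /* index of the oldest queued event */
	size_t        count;  /* number of events currently queued */
	unsigned long lost;   /* events dropped because the queue was full */
};

/*
 * Producer side: never waits for the reader. Returns false and bumps
 * the lost counter if there is no room for the event.
 */
static bool evq_push(struct evq *q, const char *json)
{
	size_t tail;

	if (q->count == EVQ_CAPACITY) {
		q->lost++;
		return false;
	}

	tail = (q->head + q->count) % EVQ_CAPACITY;
	/* snprintf truncates overlong events and always NUL-terminates. */
	snprintf(q->msgs[tail], EVQ_MSG_MAX, "%s", json);
	q->count++;
	return true;
}

/*
 * Consumer side: copies the oldest event into buf and removes it from
 * the queue. Returns false if the queue is empty.
 */
static bool evq_pop(struct evq *q, char *buf, size_t len)
{
	assert(len > 0);

	if (q->count == 0)
		return false;

	snprintf(buf, len, "%s", q->msgs[q->head]);
	q->head = (q->head + 1) % EVQ_CAPACITY;
	q->count--;
	return true;
}
```

A caller would zero-initialize a struct evq, call evq_push() from the
event source, and drain it with evq_pop() from the reader. A nonzero
lost count tells the reader that it missed events and may want to
resynchronize by some other means.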