[PATCHSET 5/5] xfs: live health monitoring of filesystems

Hi all,

This patchset builds on Kent Overstreet's thread_with_file code to
deliver live information about filesystem health events to userspace.
This is done by creating a twf file and hooking internal operations so
that the event information can be queued to the twf without stalling the
kernel if the twf client program is nonresponsive.  This is a private
ioctl, so events are expressed using simple JSON objects so that we can
enrich the output later on without having to rev a ton of C structs.
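
To illustrate the "queue without stalling" idea, here is a minimal
userspace sketch in Python.  It is not the kernel implementation: the
EventQueue class, the max_events limit, and the "lost" record are all
hypothetical names invented for this example.  The only point is that a
producer can return immediately when the reader is slow, as long as the
overflow is counted and reported later.

```python
import json
from collections import deque


class EventQueue:
    """Bounded queue: the producer never blocks; overflow is counted, not stored."""

    def __init__(self, max_events):
        self.max_events = max_events
        self.events = deque()
        self.lost = 0  # events dropped since the last drain

    def push(self, event):
        """Producer side: returns immediately whether or not the event was queued."""
        if len(self.events) >= self.max_events:
            self.lost += 1
            return False
        self.events.append(event)
        return True

    def drain(self):
        """Reader side: return one JSON line per queued event, oldest first."""
        lines = []
        while self.events:
            lines.append(json.dumps(self.events.popleft()))
        if self.lost:
            # Hypothetical record telling the reader that its view is incomplete.
            lines.append(json.dumps({"type": "lost", "count": self.lost}))
            self.lost = 0
        return lines
```

Because each event is a self-describing object, new keys can be added
to it later without breaking a reader that only looks at the keys it
already knows about.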

In userspace, we create a new daemon program that will read the JSON
event objects and initiate repairs automatically.  This daemon is
managed entirely by systemd and will not block unmounting of the
filesystem unless repairs are ongoing.  It is autostarted via some
horrible udev rules.
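
The read-and-dispatch loop of such a daemon could look roughly like the
sketch below.  This is not the actual daemon from the xfsprogs tree, and
it assumes things this letter does not state: that events arrive one
JSON object per line, and that each object carries a "type" key.  The
real layout is defined by xfs_healthmon.schema.json; the handler table
and key names here are hypothetical.

```python
import json


def read_events(stream):
    """Yield one dict per non-empty line; skip anything that is not a JSON object."""
    for line in stream:
        line = line.strip()
        if not line:
            continue
        try:
            obj = json.loads(line)
        except json.JSONDecodeError:
            continue  # tolerate garbage rather than killing the daemon
        if isinstance(obj, dict):
            yield obj


def dispatch(event, handlers):
    """Run the handler registered for this event type, if there is one."""
    handler = handlers.get(event.get("type"))
    if handler is None:
        return None  # unknown event types are ignored, not fatal
    return handler(event)
```

Ignoring unknown event types and unknown keys is what lets the kernel
side enrich its output later without breaking older daemons.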

If you're going to start using this code, I strongly recommend pulling
from my git trees, which are linked below.

This has been running on the djcloud for months with no problems.  Enjoy!
Comments and questions are, as always, welcome.

--D

kernel git tree:
https://git.kernel.org/cgit/linux/kernel/git/djwong/xfs-linux.git/log/?h=health-monitoring

xfsprogs git tree:
https://git.kernel.org/cgit/linux/kernel/git/djwong/xfsprogs-dev.git/log/?h=health-monitoring

fstests git tree:
https://git.kernel.org/cgit/linux/kernel/git/djwong/xfstests-dev.git/log/?h=health-monitoring
---
Commits in this patchset:
 * xfs: create debugfs uuid aliases
 * xfs: create hooks for monitoring health updates
 * xfs: create a filesystem shutdown hook
 * xfs: create hooks for media errors
 * iomap, filemap: report buffered read and write io errors to the filesystem
 * iomap: report directio read and write errors to callers
 * xfs: create file io error hooks
 * xfs: create a special file to pass filesystem health to userspace
 * xfs: create event queuing, formatting, and discovery infrastructure
 * xfs: report metadata health events through healthmon
 * xfs: report shutdown events through healthmon
 * xfs: report media errors through healthmon
 * xfs: report file io errors through healthmon
 * xfs: allow reconfiguration of the health monitoring device
 * xfs: add media error reporting ioctl
 * xfs: send uevents when mounting and unmounting a filesystem
---
 Documentation/filesystems/vfs.rst       |    7 
 fs/iomap/buffered-io.c                  |   26 +
 fs/iomap/direct-io.c                    |    4 
 fs/xfs/Kconfig                          |    8 
 fs/xfs/Makefile                         |    7 
 fs/xfs/libxfs/xfs_fs.h                  |   31 +
 fs/xfs/libxfs/xfs_health.h              |   47 +
 fs/xfs/libxfs/xfs_healthmon.schema.json |  595 +++++++++++++
 fs/xfs/xfs_aops.c                       |    2 
 fs/xfs/xfs_file.c                       |  167 ++++
 fs/xfs/xfs_file.h                       |   36 +
 fs/xfs/xfs_fsops.c                      |   57 +
 fs/xfs/xfs_fsops.h                      |   14 
 fs/xfs/xfs_health.c                     |  202 +++++
 fs/xfs/xfs_healthmon.c                  | 1372 +++++++++++++++++++++++++++++++
 fs/xfs/xfs_healthmon.h                  |  102 ++
 fs/xfs/xfs_ioctl.c                      |    7 
 fs/xfs/xfs_linux.h                      |    3 
 fs/xfs/xfs_mount.h                      |   13 
 fs/xfs/xfs_notify_failure.c             |  137 +++
 fs/xfs/xfs_notify_failure.h             |   44 +
 fs/xfs/xfs_super.c                      |   55 +
 fs/xfs/xfs_trace.c                      |    4 
 fs/xfs/xfs_trace.h                      |  369 ++++++++
 include/linux/fs.h                      |    4 
 include/linux/iomap.h                   |    2 
 26 files changed, 3301 insertions(+), 14 deletions(-)
 create mode 100644 fs/xfs/libxfs/xfs_healthmon.schema.json
 create mode 100644 fs/xfs/xfs_healthmon.c
 create mode 100644 fs/xfs/xfs_healthmon.h
