On Sun, Apr 16, 2023 at 11:11 AM Qu Wenruo <quwenruo.btrfs@xxxxxxx> wrote:
>
>
>
> On 2023/3/1 04:49, Darrick J. Wong wrote:
> > Hello fsdevel people,
> >
> > Five years ago[0], we started a conversation about cross-filesystem
> > userspace tooling for online fsck.  I think enough time has passed for
> > us to have another one, since a few things have happened since then:
> >
> > 1. ext4 has gained the ability to send corruption reports to a userspace
> >    monitoring program via fsnotify.  Thanks, Collabora!
>
> Not familiar with the new fsnotify thing, any article to start?

https://docs.kernel.org/admin-guide/filesystem-monitoring.html#file-system-error-reporting

fs needs to opt-in with fsnotify_sb_error() calls and currently, only
ext4 does that.

>
> I really believe we should have a generic interface to report errors,
> currently btrfs reports extra details just through dmesg (like the
> logical/physical of the corruption, reason, involved inodes etc), which
> is far from ideal.
>
> >
> > 2. XFS now tracks successful scrubs and corruptions seen during runtime
> >    and during scrubs.  Userspace can query this information.
> >
> > 3. Directory parent pointers, which enable online repair of the
> >    directory tree, is nearing completion.
> >
> > 4. Dave and I are working on merging online repair of space metadata for
> >    XFS.  Online repair of directory trees is feature complete, but we
> >    still have one or two unresolved questions in the parent pointer
> >    code.
> >
> > 5. I've gotten a bit better[1] at writing systemd service descriptions
> >    for scheduling and performing background online fsck.
> >
> > Now that fsnotify_sb_error exists as a result of (1), I think we
> > should figure out how to plumb calls into the readahead and writeback
> > code so that IO failures can be reported to the fsnotify monitor.  I
> > suspect there may be a few difficulties here since fsnotify (iirc)
> > allocates memory and takes locks.
> >
> > As a result of (2), XFS now retains quite a bit of incore state about
> > its own health.  The structure that fsnotify gives to userspace is very
> > generic (superblock, inode, errno, errno count).  How might XFS export
> > a greater amount of information via this interface?  We can provide
> > details at finer granularity -- for example, a specific data structure
> > under an allocation group or an inode, or specific quota records.
>
> The same for btrfs.
>
> Some btrfs specific info like subvolume id is also needed to locate the
> corrupted inode (ino is not unique among the full fs, but only inside
> one subvolume).
>

The fanotify error event (which btrfs does not currently generate)
contains an "FID record", where FID is fsid+file_handle.
For btrfs, the file_handle would be of type FILEID_BTRFS_WITHOUT_PARENT,
so it would include the subvol root ino.

> And something like file paths for the corrupted inode is also very
> helpful for end users to locate (and normally delete) the offending inode.
>

This interface was merged without the ability to report an fs-specific
info blob, but it was designed in a way that would allow adding that blob.

Thanks,
Amir.
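
To make the FID record point concrete, here is a rough sketch of how a
userspace monitor might decode one FAN_FS_ERROR event after read()ing it
from a fanotify fd. The record layouts follow the fanotify UAPI structs
(fanotify_event_metadata, fanotify_event_info_error,
fanotify_event_info_fid). The btrfs part is hypothetical, since btrfs does
not generate these events today: it assumes the handle body would match the
leading fields of struct btrfs_fid (objectid, root_objectid, gen). The
function names parse_fs_error_event() and decode_btrfs_handle() are made up
for this example.

```python
import struct

# Constants from the fanotify and exportfs headers.
FAN_FS_ERROR = 0x00008000
FAN_EVENT_INFO_TYPE_FID = 1
FAN_EVENT_INFO_TYPE_ERROR = 5
FILEID_BTRFS_WITHOUT_PARENT = 0x4D

# "=" means native byte order with no implicit padding.
METADATA = struct.Struct("=IBBHQii")  # struct fanotify_event_metadata
INFO_HDR = struct.Struct("=BBH")      # struct fanotify_event_info_header
ERROR_BODY = struct.Struct("=iI")     # error, error_count
FID_BODY = struct.Struct("=8sIi")     # fsid, handle_bytes, handle_type
BTRFS_FID = struct.Struct("=QQI")     # assumed: objectid, root_objectid, gen


def parse_fs_error_event(buf):
    """Decode the error and FID records of a single FAN_FS_ERROR event."""
    event_len, _vers, _rsvd, metadata_len, mask, _fd, _pid = \
        METADATA.unpack_from(buf, 0)
    if not mask & FAN_FS_ERROR:
        return None
    out = {}
    off = metadata_len
    # Info records follow the metadata; each starts with a small header.
    while off + INFO_HDR.size <= event_len:
        info_type, _pad, rec_len = INFO_HDR.unpack_from(buf, off)
        if rec_len < INFO_HDR.size:
            break  # malformed record, stop rather than loop forever
        body = off + INFO_HDR.size
        if info_type == FAN_EVENT_INFO_TYPE_ERROR:
            out["errno"], out["error_count"] = ERROR_BODY.unpack_from(buf, body)
        elif info_type == FAN_EVENT_INFO_TYPE_FID:
            fsid, handle_bytes, handle_type = FID_BODY.unpack_from(buf, body)
            start = body + FID_BODY.size
            out["fsid"] = fsid.hex()
            out["handle_type"] = handle_type
            out["handle"] = bytes(buf[start:start + handle_bytes])
        off += rec_len
    return out


def decode_btrfs_handle(handle_type, handle):
    """Hypothetical: split a btrfs handle into inode and subvolume root."""
    if handle_type != FILEID_BTRFS_WITHOUT_PARENT or len(handle) < BTRFS_FID.size:
        return None
    objectid, root_objectid, gen = BTRFS_FID.unpack_from(handle)
    return {"ino": objectid, "subvol_root": root_objectid, "gen": gen}
```

Nothing here carries a path or any fs-specific detail; that would need the
info blob extension mentioned above.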