On Tue, Feb 28, 2023 at 10:49 PM Darrick J. Wong <djwong@xxxxxxxxxx> wrote:
>
> Hello fsdevel people,
>
> Five years ago[0], we started a conversation about cross-filesystem
> userspace tooling for online fsck. I think enough time has passed for
> us to have another one, since a few things have happened since then:
>
> 1. ext4 has gained the ability to send corruption reports to a userspace
>    monitoring program via fsnotify. Thanks, Collabora!
>
> 2. XFS now tracks successful scrubs and corruptions seen during runtime
>    and during scrubs. Userspace can query this information.
>
> 3. Directory parent pointers, which enable online repair of the
>    directory tree, is nearing completion.
>
> 4. Dave and I are working on merging online repair of space metadata for
>    XFS. Online repair of directory trees is feature complete, but we
>    still have one or two unresolved questions in the parent pointer
>    code.
>
> 5. I've gotten a bit better[1] at writing systemd service descriptions
>    for scheduling and performing background online fsck.
>
> Now that fsnotify_sb_error exists as a result of (1), I think we
> should figure out how to plumb calls into the readahead and writeback
> code so that IO failures can be reported to the fsnotify monitor. I
> suspect there may be a few difficulties here since fsnotify (iirc)
> allocates memory and takes locks.
>
> As a result of (2), XFS now retains quite a bit of incore state about
> its own health. The structure that fsnotify gives to userspace is very
> generic (superblock, inode, errno, errno count). How might XFS export
> a greater amount of information via this interface? We can provide
> details at finer granularity -- for example, a specific data structure
> under an allocation group or an inode, or specific quota records.
>
> With (4) on the way, I can envision wanting a system service that would
> watch for these fsnotify events, and transform the error reports into
> targeted repair calls in the kernel.
> This of course would be very
> filesystem specific, but I would also like to hear from anyone pondering
> other usecases for fsnotify filesystem error monitors.
>
> Once (3) lands, XFS gains the ability to translate a block device IO
> error to an inode number and file offset, and then the inode number to a
> path. In other words, your file breaks and now we can tell applications
> which file it was so they can failover or redownload it or whatever.
> Ric Wheeler mentioned this in 2018's session.
>
> The final topic from that 2018 session concerned generic wrappers for
> fsscrub. I haven't pushed hard on that topic because XFS hasn't had
> much to show for that. Now that I'm better versed in systemd services,
> I envision three ways to interact with online fsck:
>
> - A CLI program that can be run by anyone.
>
> - Background systemd services that fire up periodically.
>
> - A dbus service that programs can bind to and request a fsck.
>
> I still think there's an opportunity to standardize the naming to make
> it easier to use a variety of filesystems. I propose for the CLI:
>
> /usr/sbin/fsscrub $mnt that calls /usr/sbin/fsscrub.$FSTYP $mnt
>
> For systemd services, I propose "fsscrub@<escaped mountpoint>". I
> suspect we want a separate background service that itself runs
> periodically and invokes the fsscrub@$mnt services. xfsprogs already
> has a xfs_scrub_all service that does that. The services are nifty
> because it's really easy to restrict privileges, implement resource
> usage controls, and use private name/mountspaces to isolate the process
> from the rest of the system.
>
> dbus is a bit trickier, since there's no precedent at all. I guess
> we'd have to define an interface for filesystem "object". Then we could
> write a service that establishes a well-known bus name and maintains
> object paths for each mounted filesystem. Each of those objects would
> export the filesystem interface, and that's how programs would call
> online fsck as a service.
>
> Ok, that's enough for a single session topic. Thoughts? :)

Darrick,

Quick question.
You indicated that you would like to discuss the topics:
Atomic file contents exchange
Atomic directio writes

Are those intended to be in a separate session from online fsck?
Both in the same session?

I know you posted patches for FIEXCHANGE_RANGE [1],
but they were hiding inside a huge DELUGE and people
were on New Year's holidays, so nobody commented.

Perhaps you should consider posting an up-to-date
topic suggestion to let people have an opportunity to
start a discussion before LSFMM.

Thanks,
Amir.

[1] https://lore.kernel.org/linux-fsdevel/167243843494.699466.5163281976943635014.stgit@magnolia/
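
For anyone who wants to experiment with the naming scheme quoted above, here is a rough Python sketch of how a generic wrapper might map a mount point to the proposed /usr/sbin/fsscrub.$FSTYP helper and to a "fsscrub@<escaped mountpoint>" unit name. Nothing here exists today: the function names are hypothetical, the ".service" suffix is an assumption, and the escaping only approximates what `systemd-escape --path` does.

```python
import os


def escape_mount_path(path):
    """Approximate `systemd-escape --path` for a mount point (sketch only)."""
    trimmed = os.path.normpath(path).strip("/")
    if not trimmed:
        return "-"  # the root directory escapes to a single dash
    out = []
    for i, byte in enumerate(trimmed.encode("utf-8")):
        ch = chr(byte)
        if ch == "/":
            out.append("-")
        elif (ch.isascii() and ch.isalnum()) or ch in ":_" or (ch == "." and i > 0):
            out.append(ch)
        else:
            # Everything else, including "-" itself, becomes \xXX.
            out.append("\\x%02x" % byte)
    return "".join(out)


def fstype_of(mountpoint, mounts_text):
    """Look up the filesystem type in /proc/self/mounts-style text.

    Limitation: mount points containing spaces are octal-escaped in that
    file and are not decoded here.
    """
    fstype = None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 3 and fields[1] == mountpoint:
            fstype = fields[2]  # a later mount shadows an earlier one
    return fstype


def scrub_command(mountpoint, mounts_text):
    """Build the argv for the hypothetical per-filesystem helper."""
    fstype = fstype_of(mountpoint, mounts_text)
    if fstype is None:
        raise LookupError("%s is not a mount point" % mountpoint)
    return ["/usr/sbin/fsscrub.%s" % fstype, mountpoint]


def scrub_unit(mountpoint):
    """Build the proposed unit name; the .service suffix is assumed."""
    return "fsscrub@%s.service" % escape_mount_path(mountpoint)
```

A real wrapper would read /proc/self/mounts itself and exec the resulting command; taking the mount table as a string just keeps the sketch easy to try out without root.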