On Thu, Aug 11, 2022 at 10:20:12AM +1000, Dave Chinner wrote:
> On Sun, Aug 07, 2022 at 11:30:28AM -0700, Darrick J. Wong wrote:
> > From: Darrick J. Wong <djwong@xxxxxxxxxx>
> > 
> > Start the fourth chapter of the online fsck design documentation, which
> > discusses the user interface and the background scrubbing service.
> > 
> > Signed-off-by: Darrick J. Wong <djwong@xxxxxxxxxx>
> > ---
> >  .../filesystems/xfs-online-fsck-design.rst |  114 ++++++++++++++++++++
> >  1 file changed, 114 insertions(+)
> > 
> > 
> > diff --git a/Documentation/filesystems/xfs-online-fsck-design.rst b/Documentation/filesystems/xfs-online-fsck-design.rst
> > index d630b6bdbe4a..42e82971e036 100644
> > --- a/Documentation/filesystems/xfs-online-fsck-design.rst
> > +++ b/Documentation/filesystems/xfs-online-fsck-design.rst
> > @@ -750,3 +750,117 @@ Proposed patchsets include `general stress testing
> > <https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfstests-dev.git/log/?h=race-scrub-and-mount-state-changes>`_
> > and the `evolution of existing per-function stress testing
> > <https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfstests-dev.git/log/?h=refactor-scrub-stress>`_.
> > +
> > +4. User Interface
> > +=================
> > +
> > +The primary user of online fsck is the system administrator, just like offline
> > +repair.
> > +Online fsck presents two modes of operation to administrators:
> > +A foreground CLI process for online fsck on demand, and a background service
> > +that performs autonomous checking and repair.
> > +
> > +Checking on Demand
> > +------------------
> > +
> > +For administrators who want the absolute freshest information about the
> > +metadata in a filesystem, ``xfs_scrub`` can be run as a foreground process on
> > +a command line.
> > +The program checks every piece of metadata in the filesystem while the
> > +administrator waits for the results to be reported, just like the existing
> > +``xfs_repair`` tool.
> > +Both tools share a ``-n`` option to perform a read-only scan, and a ``-v``
> > +option to increase the verbosity of the information reported.
> > +
> > +A new feature of ``xfs_scrub`` is the ``-x`` option, which employs the error
> > +correction capabilities of the hardware to check data file contents.
> > +The media scan is not enabled by default because it may dramatically increase
> > +program runtime and consume a lot of bandwidth on older storage hardware.
> 
> So '-x' runs a media scrub command?  What does that do with software
> RAID?

Nothing special unless the RAID controller itself does parity checking
of reads -- the kernel doesn't have any API calls (that I know of) to do
that.  I think md-raid5 will check the parity, but afaict nothing else
(raid1) does that.

> Does that trigger parity checks of the RAID volume, or pass
> through to the underlying hardware to do physical media scrub?

Chaitanya proposed a userspace api so that xfs_scrub could actually ask
the hardware to perform a media verification[1], but willy pointed out
that none of the device protocols have a means for the device to prove
that it did anything, so it stalled.

[1] https://lore.kernel.org/linux-fsdevel/20220713072019.5885-1-kch@xxxxxxxxxx/

> Or maybe both?

I wish. :)

> Rewriting the paragraph to be focussed around the functionality
> being provided (i.e "media scrubbing is a new feature of xfs_scrub.
> It provides .....")

Er... are you doing that, or asking me to do it?

> > +The output of a foreground invocation is captured in the system log.
> 
> At what log level?

That depends on the message, but right now it only uses
LOG_{ERR,WARNING,INFO}.  Errors, corruptions, and unfixable problems are
LOG_ERR.  Warnings are LOG_WARNING.  Notices of information, repairs
completed, and optimizations made are all recorded with LOG_INFO.

> > +The ``xfs_scrub_all`` program walks the list of mounted filesystems and
> > +initiates ``xfs_scrub`` for each of them in parallel.
> > +It serializes scans for any filesystems that resolve to the same top level
> > +kernel block device to prevent resource overconsumption.
> 
> Is this serialisation necessary for non-HDD devices?

That ultimately depends on the preferences of the sysadmins, but for the
initial push I'd rather err on the side of using fewer iops on a running
system.

> > +Background Service
> > +------------------
> > +
> > +To reduce the workload of system administrators, the ``xfs_scrub`` package
> > +provides a suite of `systemd <https://systemd.io/>`_ timers and services that
> > +run online fsck automatically on weekends.
> 
> Weekends change depending on where you are in the world, right? So
> maybe this should be more explicit?

Sunday at 3:10am, whenever that is in the local time zone.

> [....]
> 
> > +**Question**: Should the health reporting integrate with the new inotify fs
> > +error notification system?
> 
> Can the new inotify fs error notification system report complex
> health information structures?

In theory, yes, said the authors.

> How much pain is involved in making
> it do what we want, considering we already have a health reporting
> ioctl that can be polled?

I haven't tried this myself, but I think it involves defining a new type
code and message length within the inotify system.  The last time I
looked at the netlink protocol, I /think/ I saw that the consuming
programs will read the header, see that there's a type code and a buffer
length, and decide to use it or skip it.

That said, there were some size and GFP_ limits on what could be sent,
so I don't know how difficult it would be to make this part actually
work in practice.  Gabriel said it wouldn't be too difficult once I was
ready.

> > +**Question**: Would it be helpful for sysadmins to have a daemon to listen for
> > +corruption notifications and initiate a repair?
> 
> Seems like an obvious extension to the online repair capability.

...too bad there are dragons thataways.
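(Going back to the log-level question earlier in this mail: the mapping
boils down to something like the sketch below.  Python for brevity; the
real tool is C, and the category names here are hypothetical labels for
the message classes described above.)

```python
import syslog

# Sketch of the message-class -> syslog priority mapping described
# above.  Errors, corruptions, and unfixable problems go to LOG_ERR;
# warnings to LOG_WARNING; informational notices, completed repairs,
# and optimizations to LOG_INFO.
SCRUB_LOG_PRIORITY = {
    "error":        syslog.LOG_ERR,
    "corruption":   syslog.LOG_ERR,
    "unfixable":    syslog.LOG_ERR,
    "warning":      syslog.LOG_WARNING,
    "info":         syslog.LOG_INFO,
    "repair":       syslog.LOG_INFO,
    "optimization": syslog.LOG_INFO,
}

def log_scrub_message(category, message):
    """Record a scrub result in the system log at the right priority."""
    syslog.syslog(SCRUB_LOG_PRIORITY[category], message)
```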
--D

> Cheers,
> 
> Dave.
> -- 
> Dave Chinner
> david@xxxxxxxxxxxxx