On Fri, 2022-12-30 at 14:10 -0800, Darrick J. Wong wrote:
> From: Darrick J. Wong <djwong@xxxxxxxxxx>
>
> Start the fourth chapter of the online fsck design documentation, which
> discusses the user interface and the background scrubbing service.
>
> Signed-off-by: Darrick J. Wong <djwong@xxxxxxxxxx>
> ---
>  .../filesystems/xfs-online-fsck-design.rst | 114 ++++++++++++++++++++
>  1 file changed, 114 insertions(+)
>
>
> diff --git a/Documentation/filesystems/xfs-online-fsck-design.rst b/Documentation/filesystems/xfs-online-fsck-design.rst
> index d630b6bdbe4a..42e82971e036 100644
> --- a/Documentation/filesystems/xfs-online-fsck-design.rst
> +++ b/Documentation/filesystems/xfs-online-fsck-design.rst
> @@ -750,3 +750,117 @@ Proposed patchsets include `general stress testing
>  <https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfstests-dev.git/log/?h=race-scrub-and-mount-state-changes>`_
>  and the `evolution of existing per-function stress testing
>  <https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfstests-dev.git/log/?h=refactor-scrub-stress>`_.
> +
> +4. User Interface
> +=================
> +
> +The primary user of online fsck is the system administrator, just like offline
> +repair.
> +Online fsck presents two modes of operation to administrators:
> +A foreground CLI process for online fsck on demand, and a background service
> +that performs autonomous checking and repair.
> +
> +Checking on Demand
> +------------------
> +
> +For administrators who want the absolute freshest information about the
> +metadata in a filesystem, ``xfs_scrub`` can be run as a foreground process on
> +a command line.
> +The program checks every piece of metadata in the filesystem while the
> +administrator waits for the results to be reported, just like the existing
> +``xfs_repair`` tool.
> +Both tools share a ``-n`` option to perform a read-only scan, and a ``-v``
> +option to increase the verbosity of the information reported.
> +
> +A new feature of ``xfs_scrub`` is the ``-x`` option, which employs the error
> +correction capabilities of the hardware to check data file contents.
> +The media scan is not enabled by default because it may dramatically increase
> +program runtime and consume a lot of bandwidth on older storage hardware.
> +
> +The output of a foreground invocation is captured in the system log.
> +
> +The ``xfs_scrub_all`` program walks the list of mounted filesystems and
> +initiates ``xfs_scrub`` for each of them in parallel.
> +It serializes scans for any filesystems that resolve to the same top level
> +kernel block device to prevent resource overconsumption.
> +
> +Background Service
> +------------------
> +

I'm assuming the below systemd services are configurable, right?

> +To reduce the workload of system administrators, the ``xfs_scrub`` package
> +provides a suite of `systemd <https://systemd.io/>`_ timers and services that
> +run online fsck automatically on weekends.

by default.

> +The background service configures scrub to run with as little privilege as
> +possible, the lowest CPU and IO priority, and in a CPU-constrained single
> +threaded mode.

"This can be tuned at any time to best suit the needs of the customer
workload."

Then I think you can drop the below line...

> +It is hoped that this minimizes the amount of load generated on the system and
> +avoids starving regular workloads.
> +
> +The output of the background service is also captured in the system log.
> +If desired, reports of failures (either due to inconsistencies or mere runtime
> +errors) can be emailed automatically by setting the ``EMAIL_ADDR`` environment
> +variable in the following service files:
> +
> +* ``xfs_scrub_fail@.service``
> +* ``xfs_scrub_media_fail@.service``
> +* ``xfs_scrub_all_fail.service``
> +
> +The decision to enable the background scan is left to the system administrator.
> +This can be done by enabling either of the following services:
> +
> +* ``xfs_scrub_all.timer`` on systemd systems
> +* ``xfs_scrub_all.cron`` on non-systemd systems
> +
> +This automatic weekly scan is configured out of the box to perform an
> +additional media scan of all file data once per month.
> +This is less foolproof than, say, storing file data block checksums, but much
> +more performant if application software provides its own integrity checking,
> +redundancy can be provided elsewhere above the filesystem, or the storage
> +device's integrity guarantees are deemed sufficient.
> +
> +The systemd unit file definitions have been subjected to a security audit
> +(as of systemd 249) to ensure that the xfs_scrub processes have as little
> +access to the rest of the system as possible.
> +This was performed via ``systemd-analyze security``, after which privileges
> +were restricted to the minimum required, sandboxing was set up to the maximal
> +extent possible with sandboxing and system call filtering; and access to the
> +filesystem tree was restricted to the minimum needed to start the program and
> +access the filesystem being scanned.
> +The service definition files restrict CPU usage to 80% of one CPU core, and
> +apply as nice of a priority to IO and CPU scheduling as possible.
> +This measure was taken to minimize delays in the rest of the filesystem.
> +No such hardening has been performed for the cron job.
> +
> +Proposed patchset:
> +`Enabling the xfs_scrub background service
> +<https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfsprogs-dev.git/log/?h=scrub-media-scan-service>`_.
> +
> +Health Reporting
> +----------------
> +
> +XFS caches a summary of each filesystem's health status in memory.
> +The information is updated whenever ``xfs_scrub`` is run, or whenever
> +inconsistencies are detected in the filesystem metadata during regular
> +operations.
> +System administrators should use the ``health`` command of ``xfs_spaceman`` to
> +download this information into a human-readable format.
> +If problems have been observed, the administrator can schedule a reduced
> +service window to run the online repair tool to correct the problem.
> +Failing that, the administrator can decide to schedule a maintenance window to
> +run the traditional offline repair tool to correct the problem.
> +
> +**Question**: Should the health reporting integrate with the new inotify fs
> +error notification system?
> +
> +**Question**: Would it be helpful for sysadmins to have a daemon to listen for
> +corruption notifications and initiate a repair?
> +
> +*Answer*: These questions remain unanswered, but should be a part of the
> +conversation with early adopters and potential downstream users of XFS.

I think if there's been no commentary at this point, then they likely
can't be answered at this time.  Perhaps for now it is reasonable to
just let this be a potential improvement in the future if the demand for
it arises.  In any case, I think we should probably clean out the Q&A
discussion prompts.

Rest looks good tho

Allison

> +
> +Proposed patchsets include
> +`wiring up health reports to correction returns
> +<https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux.git/log/?h=corruption-health-reports>`_
> +and
> +`preservation of sickness info during memory reclaim
> +<https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux.git/log/?h=indirect-health-reporting>`_.
>
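
For readers following the thread, the ``xfs_scrub_all`` scheduling rule
quoted above (scan filesystems in parallel, but serialize any that resolve
to the same top level block device) can be sketched roughly as below. This
is only an illustration of the rule as the patch describes it, not the
actual ``xfs_scrub_all`` code: the function names, the shape of the mount
list, and the ``scrub_one`` callback are all assumptions made for the
example.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def group_by_device(mounts):
    """Map each top-level block device to the mount points backed by it.

    `mounts` is a list of (mountpoint, device) pairs. This shape is a
    hypothetical stand-in for whatever the real tool reads from the system.
    """
    groups = defaultdict(list)
    for mountpoint, device in mounts:
        groups[device].append(mountpoint)
    return dict(groups)


def scrub_all(mounts, scrub_one):
    """Call scrub_one(mountpoint) for every mount.

    Different devices are handled in parallel (one worker per device);
    mount points that share a device are handled one after another.
    """
    groups = group_by_device(mounts)
    if not groups:
        return {}

    def scrub_device(mountpoints):
        # Serial loop: only one scan at a time touches this device.
        return [scrub_one(m) for m in mountpoints]

    with ThreadPoolExecutor(max_workers=len(groups)) as pool:
        futures = {
            dev: pool.submit(scrub_device, mps) for dev, mps in groups.items()
        }
        return {dev: fut.result() for dev, fut in futures.items()}
```

In a real deployment ``scrub_one`` would presumably launch ``xfs_scrub`` on
the mount point; here it is left as a callback so the grouping logic can be
exercised without touching a filesystem.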