Here are some answers to your questions:

On Sun, Mar 6, 2022 at 3:57 AM Arnaud M <arnaud.meauzoone@xxxxxxxxx> wrote:

> Hello to everyone :)
>
> Just some questions about filesystem scrubbing
>
> In this documentation it is said that scrub will help admin check
> consistency of filesystem:
>
> https://docs.ceph.com/en/latest/cephfs/scrub/
>
> So my questions are:
>
> Is filesystem scrubbing mandatory ?
> How often should I scrub the whole filesystem (ie start at /)
> How often should I scrub ~mdsdir
> Should I set up a cronjob ?
> Is filesystem scrubbing considered harmless ? Even with recursive force
> repair ?
> Is there any chance for scrubbing to overload mds on a big file system (ie
> like find . -ls) ?
> What is the difference between "recursive repair" and "recursive force
> repair" ? Is "force" harmless ?
> Is there any way to see at which file/folder is the scrub operation ? In
> fact any better way to see scrub progress than "scrub status" which doesn't
> say much
>
> Sorry for all the questions, but there is not that much documentation about
> filesystem scrubbing. And I do think the answers will help a lot of cephfs
> administrators :)
>
> Thanks to all
>
> All the best
>
> Arnaud
> _______________________________________________
> ceph-users mailing list -- ceph-users@xxxxxxx
> To unsubscribe send an email to ceph-users-leave@xxxxxxx
>
>

1. Is filesystem scrubbing mandatory?

As a routine system administration practice, it is good to ensure that your file-system is always in a good state. To avoid the file-system becoming a bottleneck during work hours, it's a good idea to reserve some time to run a recursive forward scrub and use the built-in scrub automation to fix any issues it finds. Although you can run the scrub at any directory of your choice, it's good practice to start the scrub at the file-system root once in a while. So file-system scrubbing is not mandatory, but it is highly recommended.
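
For reference, here is a minimal sketch of how a recursive forward scrub from the file-system root could be kicked off from a script. It assumes the `ceph tell mds.<fsname>:0 scrub start <path> [scrubopts]` form from the scrub documentation linked above; the file-system name `cephfs` and the helper names `scrub_start_cmd` and `run_scrub` are only illustrative, so adjust them for your cluster. It defaults to a dry run that just prints the command.

```python
import shlex
import subprocess


def scrub_start_cmd(fs_name, path="/", recursive=True, repair=False, force=False):
    """Build the argv for `ceph tell mds.<fs_name>:0 scrub start <path> [opts]`.

    Scrub options are passed as one comma-separated argument.
    """
    opts = [
        name
        for name, enabled in (
            ("recursive", recursive),
            ("force", force),
            ("repair", repair),
        )
        if enabled
    ]
    cmd = ["ceph", "tell", f"mds.{fs_name}:0", "scrub", "start", path]
    if opts:
        cmd.append(",".join(opts))
    return cmd


def run_scrub(fs_name, path="/", dry_run=True, **opts):
    """Print the scrub command (dry run) or execute it and return its output."""
    cmd = scrub_start_cmd(fs_name, path, **opts)
    if dry_run:
        print(shlex.join(cmd))
        return None
    # Raises CalledProcessError if the ceph CLI exits with a non-zero status.
    return subprocess.run(cmd, check=True, capture_output=True, text=True).stdout


if __name__ == "__main__":
    # "cephfs" is an assumed file-system name; replace it with your own.
    run_scrub("cephfs", "/", repair=True)
```

Passing `dry_run=False` runs the command for real, so it needs a host with the `ceph` CLI and a suitable keyring.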

Filesystem scrubbing is designed to read CephFS's metadata and detect inconsistencies or issues caused by bitrot or bugs, just as RADOS PG scrubbing does. In a perfect world without bugs or bit flips it would be unnecessary, but we don't live in that world, so a scrub can detect small issues before they turn into big ones, and the mere act of reading data can keep it fresh and give storage devices a chance to correct any media errors while that's still possible. We don't have a specific recommended schedule, and scrub takes up cluster IO and compute resources, so its frequency should be tailored to your workload.

2. How often should I scrub the whole filesystem (i.e. start at /)?

Since you'd always want to have a consistent file-system, it would be good to run a scrub:

   1. before taking a snapshot of the entire file-system, OR
   2. before taking a backup of the entire file-system, OR
   3. after significant metadata activity, e.g. after creating files, renaming files, deleting files, changing file attributes, etc.

There's no one-size-fits-all rule, so you'll need to follow a heuristic approach. The type of device (HDD or SSD) and the amount of activity wearing the device are the typical factors involved when deciding when to scrub a file-system. If you have a window dedicated to backup activity, then you'd want to run a recursive forward scrub with repair on the entire file-system before it is snapshotted and used for backup. Although you can run a scrub while the file-system is in active use, it is always recommended that you run the scrub on a quiet file-system so that the two activities don't get in each other's way. This also helps the scrub complete more quickly.

3. How often should I scrub ~mdsdir?

~mdsdir is used to collect deleted (stray) entries. So, the number of file/dir unlinks in a typical workload should be used to come up with a heuristic to scrub the file-system. This activity can be taken up separately from scrubbing the file-system root.

4.
Should I set up a cron job?

Yes, you could.

5. Is filesystem scrubbing considered harmless? Even with recursive force repair?

Yes, scrubbing even with repair is harmless. Scrubbing with repair does the following things:

   1. Repair backtrace
      If on-disk and in-memory backtraces don't match, then the DIRTYPARENT flag is set so that the journal logger thread picks up the inode for writing the backtrace to disk.

   2. Repair inode
      If on-disk and in-memory inode versions don't match, then the inode is left untouched. Otherwise, if the inode is marked as "free", the inode number is removed from active use.

   3. Repair recursive-stats
      If on-disk and in-memory raw-stats don't match, then all the stats for the leaves in the directory tree are marked dirty and a scatter-gather operation is forced to coalesce raw-stats info.

6. Is there any chance for scrubbing to overload mds on a big file system, i.e. like find . -ls?

Scrubbing on its own should not be able to overload an MDS, but it is an additional load on top of whatever client activity the MDS is serving, which could exceed the server's capacity. In short, yes, it might overload the MDS when run under sustained high I/O. The MDS config option mds_max_scrub_ops_in_progress, which defaults to 5, limits the number of scrub operations running at any given time. So, there is a small effort at throttling.

7. What is the difference between "recursive repair" and "recursive force repair"? Is "force" harmless?

If the "force" argument is specified, then a dirfrag is scrubbed only if

   1. the dentry version is greater than the last scrub version AND
   2. the dentry type is a DIR

If "force" is not specified, then dirfrag scrubbing is skipped. You will be able to see an MDS log message saying that scrubbing was skipped for the dentry. The rest of the scrubbing is done as described in Q5 above.

8. Is there any way to see at which file/folder is the scrub operation ?
In fact any better way to see scrub progress than "scrub status" which doesn't say much.

Currently there's no way to see which file/folder is being scrubbed. At most we could log a line in the MDS logs about it, but that could quickly bloat the logs if the number of entries is large.

-- 
Milind