On Wed, Jan 11, 2023 at 03:39:08PM -0800, Darrick J. Wong wrote:
> On Wed, Jan 11, 2023 at 01:25:12AM +0000, Allison Henderson wrote:
> > On Fri, 2022-12-30 at 14:10 -0800, Darrick J. Wong wrote:
> > > +Primary metadata objects are the simplest for scrub to process.
> > > +The principal filesystem object (either an allocation group or an inode) that
> > > +owns the item being scrubbed is locked to guard against concurrent updates.
> > > +The check function examines every record associated with the type for obvious
> > > +errors and cross-references healthy records against other metadata to look for
> > > +inconsistencies.
> > > +Repairs for this class of scrub item are simple, since the repair function
> > > +starts by holding all the resources acquired in the previous step.
> > > +The repair function scans available metadata as needed to record all the
> > > +observations needed to complete the structure.
> > > +Next, it stages the observations in a new ondisk structure and commits it
> > > +atomically to complete the repair.
> > > +Finally, the storage from the old data structure are carefully reaped.
> > > +
> > > +Because ``xfs_scrub`` locks a primary object for the duration of the repair,
> > > +this is effectively an offline repair operation performed on a subset of the
> > > +filesystem.
> > > +This minimizes the complexity of the repair code because it is not necessary to
> > > +handle concurrent updates from other threads, nor is it necessary to access
> > > +any other part of the filesystem.
> > > +As a result, indexed structures can be rebuilt very quickly, and programs
> > > +trying to access the damaged structure will be blocked until repairs complete.
> > > +The only infrastructure needed by the repair code are the staging area for
> > > +observations and a means to write new structures to disk.
> > > +Despite these limitations, the advantage that online repair holds is clear:
> > > +targeted work on individual shards of the filesystem avoids total loss of
> > > +service.
> > > +
> > > +This mechanism is described in section 2.1 ("Off-Line Algorithm") of
> > > +V. Srinivasan and M. J. Carey, `"Performance of On-Line Index Construction
> > > +Algorithms" <https://dl.acm.org/doi/10.5555/645336.649870>`_,
> >
> > Hmm, this article is not displaying for me. If the link is abandoned,
> > probably there's not much need to keep it around
>
> The actual paper is not directly available through that ACM link, but
> the DOI is what I used to track down a paper copy(!) of that paper as
> published in a journal.

PDF version here:

https://minds.wisconsin.edu/bitstream/handle/1793/59524/TR1047.pdf?sequence=1

-Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx