Re: [PATCH 02/14] xfs: document the general theory underlying online fsck design

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, Jan 11, 2023 at 03:39:08PM -0800, Darrick J. Wong wrote:
> On Wed, Jan 11, 2023 at 01:25:12AM +0000, Allison Henderson wrote:
> > On Fri, 2022-12-30 at 14:10 -0800, Darrick J. Wong wrote:
> > > +Primary metadata objects are the simplest for scrub to process.
> > > +The principal filesystem object (either an allocation group or an
> > > inode) that
> > > +owns the item being scrubbed is locked to guard against concurrent
> > > updates.
> > > +The check function examines every record associated with the type
> > > for obvious
> > > +errors and cross-references healthy records against other metadata
> > > to look for
> > > +inconsistencies.
> > > +Repairs for this class of scrub item are simple, since the repair
> > > function
> > > +starts by holding all the resources acquired in the previous step.
> > > +The repair function scans available metadata as needed to record all
> > > the
> > > +observations needed to complete the structure.
> > > +Next, it stages the observations in a new ondisk structure and
> > > commits it
> > > +atomically to complete the repair.
> > > +Finally, the storage from the old data structure are carefully
> > > reaped.
> > > +
> > > +Because ``xfs_scrub`` locks a primary object for the duration of the
> > > repair,
> > > +this is effectively an offline repair operation performed on a
> > > subset of the
> > > +filesystem.
> > > +This minimizes the complexity of the repair code because it is not
> > > necessary to
> > > +handle concurrent updates from other threads, nor is it necessary to
> > > access
> > > +any other part of the filesystem.
> > > +As a result, indexed structures can be rebuilt very quickly, and
> > > programs
> > > +trying to access the damaged structure will be blocked until repairs
> > > complete.
> > > +The only infrastructure needed by the repair code are the staging
> > > area for
> > > +observations and a means to write new structures to disk.
> > > +Despite these limitations, the advantage that online repair holds is
> > > clear:
> > > +targeted work on individual shards of the filesystem avoids total
> > > loss of
> > > +service.
> > > +
> > > +This mechanism is described in section 2.1 ("Off-Line Algorithm") of
> > > +V. Srinivasan and M. J. Carey, `"Performance of On-Line Index
> > > Construction
> > > +Algorithms" <https://dl.acm.org/doi/10.5555/645336.649870>`_,
> > Hmm, this article is not displaying for me.  If the link is abandoned,
> > probably there's not much need to keep it around
> 
> The actual paper is not directly available through that ACM link, but
> the DOI is what I used to track down a paper copy(!) of that paper as
> published in a journal.

PDF version here:

https://minds.wisconsin.edu/bitstream/handle/1793/59524/TR1047.pdf?sequence=1

-Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx



[Index of Archives]     [XFS Filesystem Development (older mail)]     [Linux Filesystem Development]     [Linux Audio Users]     [Yosemite Trails]     [Linux Kernel]     [Linux RAID]     [Linux SCSI]


  Powered by Linux