Re: ext4 backup super block update frequency

"Theodore Y. Ts'o" <tytso@xxxxxxx> · Fri, 31 Aug 2018 16:20:36 -0400

On Fri, Aug 31, 2018 at 12:40:33PM -0400, Shehbaz Jaffer wrote:
> 
> I am trying to study how backup superblocks are updated and used in
> the ext4 file system. I create a 2GB file system. This creates 5
> backup superblocks with the sparse sb option. I then check the diff of
> the backup blocks before and after multiple mount, write(2) fsync(2)
> and close(2) and unmount operations.
> 
> I can see the primary superblock get updated but I do not see the
> backup super blocks being updated. My intuition is that the backup
> blocks are only present so that the recovery can be done by replaying
> the journal on the backup superblock. Are they updated each time the
> journal gets full so that the "refreshed" journal can now be replayed
> on updated backup superblock in case of a crash?
> If this is incorrect, at what frequency do backup superblocks get updated?

The backup superblocks don't get updated, because they are the backup
in case the primary superblock gets trashed.  One of the best ways to
avoid the sector getting trashed is to not update it.  :-)

In general, the backup superblocks will only get updated when
something fundamental about the file system's parameters changes: for
example, if the file system is resized, or tune2fs is used to change
some configuration parameter of the file system.
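
As a rough illustration of where those backups live: with the
sparse_super feature, backup superblocks are kept only in block group 1
and in groups whose number is a power of 3, 5, or 7.  The sketch below
is not e2fsprogs code; the function names are made up for this example,
and it assumes the common defaults of 4 KiB blocks and 32768 blocks per
group (so the first data block is 0).

```python
def has_backup_superblock(group: int) -> bool:
    """Return True if `group` holds a backup superblock under sparse_super.

    Group 0 holds the primary superblock, so it is not counted as a backup.
    """
    if group == 1:
        return True
    for base in (3, 5, 7):
        power = base
        while power < group:
            power *= base
        if power == group:
            return True
    return False


def backup_superblock_blocks(fs_bytes: int,
                             block_size: int = 4096,
                             blocks_per_group: int = 32768,
                             first_data_block: int = 0) -> list:
    """List the block numbers where backup superblocks are expected.

    Assumption: every group is full-sized and group g starts at
    first_data_block + g * blocks_per_group.
    """
    total_blocks = fs_bytes // block_size
    group_count = -(-total_blocks // blocks_per_group)  # ceiling division
    return [first_data_block + g * blocks_per_group
            for g in range(1, group_count)
            if has_backup_superblock(g)]


if __name__ == "__main__":
    # A 2 GiB file system has 16 groups under these assumptions.
    print(backup_superblock_blocks(2 * 1024**3))
```

Under those assumptions a 2 GiB file system has 16 block groups, and the
backups land in groups 1, 3, 5, 7, and 9, which is consistent with the
five backups mentioned above.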

In ext4, the journal is located in a fixed location on disk, as a
circular buffer.  There is a separate journal superblock which points
at the beginning of the journal, and on a journal replay, we just keep
going until we find the last valid journal commit block.  When the
journal gets full, we do a checkpoint operation which allows us to
free space in the journal, and at that point, we move the
beginning of the journal, and it's only then that we need to update
the journal superblock.
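
The replay walk described above can be modelled in a few lines.  This is
only a toy model, not the real jbd2 on-disk format: the Block type, the
"data"/"commit" kinds, and the replay() function are invented for the
example, and checksums, descriptor blocks, and revoke records are left
out entirely.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Block:
    kind: str          # "data" or "commit" (hypothetical simplification)
    seq: int           # sequence number of the transaction it belongs to
    payload: str = ""


def replay(journal: List[Optional[Block]], start: int, first_seq: int) -> List[str]:
    """Walk the circular buffer from `start` and return the payloads of
    every transaction that ends in a valid commit block.

    `start` and `first_seq` play the role of what the journal superblock
    records: where the log begins and which transaction is expected there.
    """
    applied: List[str] = []
    pending: List[str] = []
    expected = first_seq
    size = len(journal)
    for i in range(size):
        blk = journal[(start + i) % size]   # wrap around the ring
        if blk is None or blk.seq != expected:
            break                           # stale or empty: end of valid log
        if blk.kind == "commit":
            applied.extend(pending)         # transaction is complete
            pending = []
            expected += 1
        else:
            pending.append(blk.payload)
    # Anything left in `pending` had no commit block and is discarded.
    return applied
```

Note that nothing in this walk needs the journal superblock to be
rewritten as new transactions are appended; in the model, `start` and
`first_seq` only change when a checkpoint frees space at the head of
the log.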

> If we compare this behavior with BtrFS, I can see that for each update
> on fs tree, the primary block (at offset 64KB) and backup superblock
> (at 64MB) gets updated.

The btrfs file system is different because it uses a copy-on-write
file system design.  That means that it has a series of B-trees where
the root blocks of the B-trees are constantly being updated, and *all*
of the file system's state is hanging off of the B-trees.  If you
can't find the root of the B-trees, you can't find *anything* in the
file system.  You'll have a bunch of B-tree nodes pointing at each
other, and there won't be any easy way to figure out where the most
recent root of the B-trees is.  For that reason, btrfs stores
information about where to find the roots of the B-trees in two
superblocks, so that a single I/O error won't cause the entire file
system to be lost.
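
For reference, btrfs places its superblock copies at fixed byte offsets:
64 KiB, 64 MiB, and (on devices large enough) 256 GiB.  The small sketch
below lists which copies fit on a device of a given size.  The function
name is made up for this example and the fit check is a simplification
of what the kernel does.

```python
KIB = 1024
MIB = 1024 * KIB
GIB = 1024 * MIB

# Fixed byte offsets of the btrfs superblock copies.
BTRFS_SUPERBLOCK_OFFSETS = (64 * KIB, 64 * MIB, 256 * GIB)
BTRFS_SUPERBLOCK_SIZE = 4096


def btrfs_superblock_offsets(device_bytes: int) -> list:
    """Return the superblock offsets that fit on a device of this size.

    Simplified assumption: a copy is present if it fits entirely
    within the device.
    """
    return [off for off in BTRFS_SUPERBLOCK_OFFSETS
            if off + BTRFS_SUPERBLOCK_SIZE <= device_bytes]


if __name__ == "__main__":
    print(btrfs_superblock_offsets(2 * GIB))
```

On a device smaller than 256 GiB only the first two copies exist, which
matches the 64KB and 64MB locations quoted above.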

In contrast, ext4's metadata is in a fixed location.  So it's much
easier for ext4's fsck to recover from a trashed superblock.  And as
far as the journal superblock is concerned, we only have to update it
when we update the beginning of the journal, as described previously.

Cheers,

					- Ted