Re: [PATCH 07/12] ext4: Defer saving error info from atomic context

"Theodore Y. Ts'o" <tytso@xxxxxxx> · Wed, 16 Dec 2020 00:31:09 -0500

On Fri, Nov 27, 2020 at 12:34:00PM +0100, Jan Kara wrote:
> When filesystem inconsistency is detected with group locked, we
> currently try to modify superblock to store error there without
> blocking. However this can cause superblock checksum failures (or
> DIF/DIX failure) when the superblock is just being written out.
> 
> Make error handling code just store error information in ext4_sb_info
> structure and copy it to on-disk superblock only in ext4_commit_super().
> In case of error happening with group locked, we just postpone the
> superblock flushing to a workqueue.
> 
> Signed-off-by: Jan Kara <jack@xxxxxxx>

So this patch does make a behavioral change, which is if a file system
contains errors when it is mounted, when the kernel trips over more
file system problems, the first error is overwritten.

Before, s_first_error_* used to mean the first error found since the
file system was checked.  With this patch, s_first_error_* now means
the first error found since the file system was mounted.

This distinction is critical, because there are some buggy distro's
(Ubuntu and Debian) out there where their cloud image does *not* run
fsck on boot.  So if a file system is corrupted, it does not get fixed
up, and file system can get more and more damaged.  In that case, it's
good to know when the file system was first damaged, even if it was
six months earlier and several reboots and remounts later.   :-/

We should be able to fix up this patch by making commit_super only
update the on-disk s_first_error* fields if s_first_error_time on disk
is 0.

					- Ted

P.S.  Bugs have been filed with both distro's.  Ubuntu has accepted it
is a bug, but we're still working on convincing the Debian cloud image
devs....

And it's not just the buggy cloud iamges; it's also happened on
occasion that sloppy embedded Linux developers or sysadmins have
misconfigured their system so that fsck never gets run, and it's nice
to be able to have the forensic data preserved.