Re: [PATCH] ext4: Fix block validation on non-journal fs in __ext4_iget()

"Theodore Ts'o" <tytso@xxxxxxx> · Fri, 13 May 2022 23:37:40 -0400

On Thu, Apr 21, 2022 at 03:23:12AM +0800, Nguyen Dinh Phi wrote:
> Syzbot report following KERNEL BUG:
> 	kernel BUG at fs/ext4/extents_status.c:899!
> 	....
> 
> The reason is fast commit recovery path will skip block validation in
> __ext4_iget(), it allows syzbot be able to mount a corrupted non-journal
> filesystem and cause kernel BUG when accessing it.
> 
> Fix it by adding a condition checking.

> diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
> index 560e56b42829..66c86d85081e 100644
> --- a/fs/ext4/inode.c
> +++ b/fs/ext4/inode.c
> @@ -4951,7 +4951,7 @@ struct inode *__ext4_iget(struct super_block *sb, unsigned long ino,
>  		goto bad_inode;
>  	} else if (!ext4_has_inline_data(inode)) {
>  		/* validate the block references in the inode */
> -		if (!(EXT4_SB(sb)->s_mount_state & EXT4_FC_REPLAY) &&
> +		if (!(journal && EXT4_SB(sb)->s_mount_state & EXT4_FC_REPLAY) &&

This isn't the right fix.  It papers over the problem and fixes the
specific syzkaller fuzzed image, but there are other corrupted file
system images which will cause problems.

What the syzkaller fuzzed file system image did was to set the
EXT4_FC_REPLAY bit in the on-disk superblock field s_state, which
then gets copied to sbi->s_mount_state:

	sbi->s_mount_state = le16_to_cpu(es->s_state);

... and then hilarity ensues.

The root cause is that we are using the EXT4_FC_REPLAY bit in
sbi->s_mount_state to indicate whether we are in the middle of a fast
commit replay.  This *should* have been done using a bit in
s_mount_flags (e.g., EXT4_MF_FC_REPLAY) via the
ext4_{set,clear,test}_mount_flag() inline functions.
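
As a rough userspace illustration of that separation (not kernel code):
the toy_* names and the struct below are hypothetical stand-ins, the
constant values are assumptions to be checked against fs/ext4/ext4.h,
and EXT4_MF_FC_REPLAY is only the name suggested above.  The point is
that a flag kept purely in memory cannot be set by a crafted s_state:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative values only; check fs/ext4/ext4.h for the real ones. */
#define EXT4_VALID_FS	0x0001u
#define EXT4_FC_REPLAY	0x0020u	/* the s_state bit a crafted image can set */

/* Hypothetical in-memory flag number, named after the suggestion above. */
enum { EXT4_MF_FC_REPLAY = 0 };

/* Toy stand-in for struct ext4_sb_info; not the kernel layout. */
struct toy_sb_info {
	uint16_t s_mount_state;		/* copied from the on-disk s_state */
	unsigned long s_mount_flags;	/* in-memory only, never read from disk */
};

static void toy_set_mount_flag(struct toy_sb_info *sbi, int bit)
{
	sbi->s_mount_flags |= 1UL << bit;
}

static void toy_clear_mount_flag(struct toy_sb_info *sbi, int bit)
{
	sbi->s_mount_flags &= ~(1UL << bit);
}

static bool toy_test_mount_flag(const struct toy_sb_info *sbi, int bit)
{
	return (sbi->s_mount_flags >> bit) & 1UL;
}

/* Mount-time load: whatever is on disk cannot reach s_mount_flags. */
static void toy_load_super(struct toy_sb_info *sbi, uint16_t disk_state)
{
	sbi->s_mount_state = disk_state;
	sbi->s_mount_flags = 0;
}

/* The validation gate, keyed on the in-memory flag instead of s_mount_state. */
static bool toy_should_validate_blocks(const struct toy_sb_info *sbi)
{
	return !toy_test_mount_flag(sbi, EXT4_MF_FC_REPLAY);
}
```

With this layout, loading an image whose s_state has EXT4_FC_REPLAY set
still leaves toy_should_validate_blocks() true; only the replay code
itself, via toy_set_mount_flag(), can turn validation off.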

The previous paragraph describes the correct long-term fix, but the
trivial/hacky fix which is easy to backport to LTS stable kernels is
something like this:

diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index 4b0ea8df1f5c..f7ae53d986f1 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -4889,7 +4889,7 @@ static int __ext4_fill_super(struct fs_context *fc, struct super_block *sb)
 					sbi->s_inodes_per_block;
 	sbi->s_desc_per_block = blocksize / EXT4_DESC_SIZE(sb);
 	sbi->s_sbh = bh;
-	sbi->s_mount_state = le16_to_cpu(es->s_state);
+	sbi->s_mount_state = le16_to_cpu(es->s_state) & ~EXT4_FC_REPLAY;
 	sbi->s_addr_per_block_bits = ilog2(EXT4_ADDR_PER_BLOCK(sb));
 	sbi->s_desc_per_block_bits = ilog2(EXT4_DESC_PER_BLOCK(sb));
 
@@ -6452,7 +6452,8 @@ static int __ext4_remount(struct fs_context *fc, struct super_block *sb)
 				if (err)
 					goto restore_opts;
 			}
-			sbi->s_mount_state = le16_to_cpu(es->s_state);
+			sbi->s_mount_state = (le16_to_cpu(es->s_state) &
+					      ~EXT4_FC_REPLAY);
 
 			err = ext4_setup_super(sb, es, 0);
 			if (err)

(The first hunk is sufficient to suppress the syzkaller failure, but
for completeness' sake we need to catch the case where the journal
contains a maliciously modified superblock, which then is copied to
the active superblock, after which hilarity once again ensues.)
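
For anyone who wants to convince themselves of what the masking does, a
minimal userspace sketch follows.  The toy_* helpers are hypothetical
(toy_le16_to_cpu() merely stands in for the kernel's le16_to_cpu()),
and the constant values are assumptions for illustration:

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative values only; check fs/ext4/ext4.h for the real ones. */
#define EXT4_VALID_FS	0x0001u
#define EXT4_FC_REPLAY	0x0020u

/* Userspace stand-in for le16_to_cpu(): decode a little-endian 16-bit field. */
static uint16_t toy_le16_to_cpu(const uint8_t raw[2])
{
	return (uint16_t)(raw[0] | (raw[1] << 8));
}

/* Same shape as the masked assignment in both hunks above. */
static uint16_t toy_load_mount_state(const uint8_t raw_s_state[2])
{
	return (uint16_t)(toy_le16_to_cpu(raw_s_state) & ~EXT4_FC_REPLAY);
}
```

Feeding it the bytes { 0x21, 0x00 } (EXT4_VALID_FS plus the replay bit)
yields plain EXT4_VALID_FS, so the in-memory state never starts out
claiming that a fast commit replay is in progress.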

    	   	       	     	   	    	 - Ted