Re: [PATCH -next] ext4: fix super block checksum incorrect after mount

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 22/05/25 01:54PM, Jan Kara wrote:
> On Wed 25-05-22 13:21:23, Ritesh Harjani wrote:
> > On 22/05/25 09:29AM, Ye Bin wrote:
> > > We got issue as follows:
> > > [home]# mount  /dev/sda  test
> > > EXT4-fs (sda): warning: mounting fs with errors, running e2fsck is recommended
> > > [home]# dmesg
> > > EXT4-fs (sda): warning: mounting fs with errors, running e2fsck is recommended
> > > EXT4-fs (sda): Errors on filesystem, clearing orphan list.
> > > EXT4-fs (sda): recovery complete
> > > EXT4-fs (sda): mounted filesystem with ordered data mode. Quota mode: none.
> > > [home]# debugfs /dev/sda
> > > debugfs 1.46.5 (30-Dec-2021)
> > > Checksum errors in superblock!  Retrying...
> > >
> > > Reason is ext4_orphan_cleanup will reset ‘s_last_orphan’ but not update
> > > super block checksum.
> > > To solve above issue, defer update super block checksum after ext4_orphan_cleanup.
> >
> > I agree with the analysis. However after [1], I think all updates to superblock
> > (including checksum computation) should be done within buffer lock.
> > (lock_buffer(), unlock_buffer()).
> >
> > [1]: https://lore.kernel.org/all/20201216101844.22917-4-jack@xxxxxxx/
>
> So technically you're right that we should hold buffer lock all the time
> from before we modify superblock buffer until we recompute the checksum (so
> that we avoid writing superblock with mismatched checksum). To do this we'd
> have to put checksum recomputations and superblock buffer locking into
> ext4_orphan_cleanup() around setting of es->s_last_orphan (in three places
> there AFAICS). A bit tedious but it would actually also fix a (theoretical)
> race that someone decides to write out superblock after we set
> s_last_orphan but before we set the checksum.

Ok. Although (I think) it can still be done at just one place before returning
from ext4_orphan_cleanup().
But I agree it is mostly a theoretical race (in fact since this is happening
during mount, I am not sure if it is even possible?) and there might not
be any value addition in doing so by complicating it too much.

>
> Overall I'm not convinced this is really necessary so I'd be OK even with
> what Ye suggested. That is IMHO better than mostly pointless locking just
> around checksum computation because that just makes reader wonder why is it
> needed...

Sure, yes. Thanks for explaining it.

-ritesh

>
> 								Honza
>
> >
> > With lock changes added, feel free to add -
> >
> > Reviewed-by: Ritesh Harjani <ritesh.list@xxxxxxxxx>
> >
> >
> > >
> > >
> > > Signed-off-by: Ye Bin <yebin10@xxxxxxxxxx>
> > > ---
> > >  fs/ext4/super.c | 16 ++++++++--------
> > >  1 file changed, 8 insertions(+), 8 deletions(-)
> > >
> > > diff --git a/fs/ext4/super.c b/fs/ext4/super.c
> > > index f9a3ad683b4a..c47204029429 100644
> > > --- a/fs/ext4/super.c
> > > +++ b/fs/ext4/super.c
> > > @@ -5300,14 +5300,6 @@ static int __ext4_fill_super(struct fs_context *fc, struct super_block *sb)
> > >  		err = percpu_counter_init(&sbi->s_freeinodes_counter, freei,
> > >  					  GFP_KERNEL);
> > >  	}
> > > -	/*
> > > -	 * Update the checksum after updating free space/inode
> > > -	 * counters.  Otherwise the superblock can have an incorrect
> > > -	 * checksum in the buffer cache until it is written out and
> > > -	 * e2fsprogs programs trying to open a file system immediately
> > > -	 * after it is mounted can fail.
> > > -	 */
> > > -	ext4_superblock_csum_set(sb);
> > >  	if (!err)
> > >  		err = percpu_counter_init(&sbi->s_dirs_counter,
> > >  					  ext4_count_dirs(sb), GFP_KERNEL);
> > > @@ -5365,6 +5357,14 @@ static int __ext4_fill_super(struct fs_context *fc, struct super_block *sb)
> > >  	EXT4_SB(sb)->s_mount_state |= EXT4_ORPHAN_FS;
> > >  	ext4_orphan_cleanup(sb, es);
> > >  	EXT4_SB(sb)->s_mount_state &= ~EXT4_ORPHAN_FS;
> > > +	/*
> > > +	 * Update the checksum after updating free space/inode counters and
> > > +	 * ext4_orphan_cleanup. Otherwise the superblock can have an incorrect
> > > +	 * checksum in the buffer cache until it is written out and
> > > +	 * e2fsprogs programs trying to open a file system immediately
> > > +	 * after it is mounted can fail.
> > > +	 */
> > > +	ext4_superblock_csum_set(sb);
> > >  	if (needs_recovery) {
> > >  		ext4_msg(sb, KERN_INFO, "recovery complete");
> > >  		err = ext4_mark_recovery_complete(sb, es);
> > > --
> > > 2.31.1
> > >
> --
> Jan Kara <jack@xxxxxxxx>
> SUSE Labs, CR



[Index of Archives]     [Reiser Filesystem Development]     [Ceph FS]     [Kernel Newbies]     [Security]     [Netfilter]     [Bugtraq]     [Linux FS]     [Yosemite National Park]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Samba]     [Device Mapper]     [Linux Media]

  Powered by Linux