On Wed, Dec 15, 2021 at 03:12:37PM +0100, Lukas Czerner wrote:
> On Wed, Dec 15, 2021 at 12:28:52PM +0100, Jan Kara wrote:
> > On Tue 14-12-21 16:49:45, Darrick J. Wong wrote:
> > > On Tue, Dec 14, 2021 at 05:50:58PM +0000, Luís Henriques wrote:
> > > > When migrating to extents, the temporary inode will have its own checksum
> > > > seed. This means that, when swapping the inodes data, the inode checksums
> > > > will be incorrect.
> > > >
> > > > This can be fixed by recalculating the extents checksums again. Or simply
> > > > by copying the seed into the temporary inode.
> > > >
> > > > Link: https://bugzilla.kernel.org/show_bug.cgi?id=213357
> > > > Reported-by: Jeroen van Wolffelaar <jeroen@xxxxxxxxxxxxx>
> > > > Signed-off-by: Luís Henriques <lhenriques@xxxxxxx>
> > > > ---
> > > >  fs/ext4/migrate.c | 12 +++++++++++-
> > > >  1 file changed, 11 insertions(+), 1 deletion(-)
> > > >
> > > > changes since v1:
> > > >
> > > > * Dropped tmp_ei variable
> > > > * ->i_csum_seed is now initialised immediately after tmp_inode is created
> > > > * New comment about the seed initialization and stating that recovery
> > > >   needs to be fixed.
> > > >
> > > > Cheers,
> > > > --
> > > > Luís
> > > >
> > > > diff --git a/fs/ext4/migrate.c b/fs/ext4/migrate.c
> > > > index 7e0b4f81c6c0..36dfc88ce05b 100644
> > > > --- a/fs/ext4/migrate.c
> > > > +++ b/fs/ext4/migrate.c
> > > > @@ -459,6 +459,17 @@ int ext4_ext_migrate(struct inode *inode)
> > > >  		ext4_journal_stop(handle);
> > > >  		goto out_unlock;
> > > >  	}
> > > > +	/*
> > > > +	 * Use the correct seed for checksum (i.e. the seed from 'inode'). This
> > > > +	 * is so that the metadata blocks will have the correct checksum after
> > > > +	 * the migration.
> > > > +	 *
> > > > +	 * Note however that, if a crash occurs during the migration process,
> > > > +	 * the recovery process is broken because the tmp_inode checksums will
> > > > +	 * be wrong and the orphans cleanup will fail.
> > >
> > > ...and then what does the user do?
> >
> > Run fsck of course! And then recover from backups :) I know this is sad but
> > the situation is that our migration code just is not crash-safe (if we
> > crash we are going to free blocks that are still used by the migrated
> > inode) and Luis makes it work in case we do not crash (which should be
> > hopefully more common) and documents it does not work in case we crash.
> > So overall I'd call it a win.
> >
> > But maybe we should just remove this online-migration functionality
> > completely from the kernel? That would be also a fine solution for me. I
> > was thinking whether we could somehow make the inode migration crash-safe
> > but I didn't think of anything which would not require on-disk format
> > change...
>
> Since this is not something that anyone can honestly recommend doing
> without a prior backup and a word of warning I personally would be in favor
> of removing it.

BTW, in case migration is kept in the kernel (even with the broken
recovery), I think it's worth turning this bug reproducer into an ext4
fstest. I was planning to do so, but I'd rather wait to see if the effort
is worthwhile (i.e. if migration is kept or not).

Cheers,
--
Luís