Re: ext4 data corruption in 6.1 stable tree (was Re: [PATCH 5.15 000/297] 5.15.140-rc1 review)

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon 11-12-23 09:28:40, Pavel Machek wrote:
> > > > So I've got back to this and the failure is a subtle interaction between
> > > > iomap code and ext4 code. In particular that fact that commit 936e114a245b6
> > > > ("iomap: update ki_pos a little later in iomap_dio_complete") is not in
> > > > stable causes that file position is not updated after direct IO write and
> > > > thus we direct IO writes are ending in wrong locations effectively
> > > > corrupting data. The subtle detail is that before this commit if ->end_io
> > > > handler returns non-zero value (which the new ext4 ->end_io handler does),
> > > > file pos doesn't get updated, after this commit it doesn't get updated only
> > > > if the return value is < 0.
> > > > 
> > > > The commit got merged in 6.5-rc1 so all stable kernels that have
> > > > 91562895f803 ("ext4: properly sync file size update after O_SYNC direct
> > > > IO") before 6.5 are corrupting data - I've noticed at least 6.1 is still
> > > > carrying the problematic commit. Greg, please take out the commit from all
> > > > stable kernels before 6.5 as soon as possible, we'll figure out proper
> > > > backport once user data are not being corrupted anymore. Thanks!
> > > > 
> > > 
> > > Thanks a lot for the update.
> > > 
> > > Turns out this is causing a regression in chromeos-6.1, and reverting the
> > > offending patch fixes the problem. I suspect anyone running v6.1.64+ may
> > > have a problem.
> > 
> > Jan, thanks for the report, and Guenter, thanks for letting me know as
> > well.  I'll go queue up the fix now and push out new -rc releases.
> 
> Would someone have a brief summary here? I see 6.1.66 is out but I
> don't see any "Fixes: 91562895f803" tags.
> 
> Plus, what is the severity of this? It is "data being corrupted when
> using O_SYNC|O_DIRECT" or does metadata somehow get corrupted, too?

It is pure data corruption happening for ext4 direct IO writes because they
do not properly update current file position after the write.

								Honza

-- 
Jan Kara <jack@xxxxxxxx>
SUSE Labs, CR




[Index of Archives]     [Linux Kernel]     [Kernel Development Newbies]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite Hiking]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux