On Fri, Nov 15, 2024 at 07:49:26PM +1030, Qu Wenruo wrote: > [BUG] > When developing sector size < page size handling for btrfs, I'm hitting > a data corruption, which is only possible with the following out-of-tree > patches: > > btrfs: allow inline data extents creation if sector size < page size > btrfs: allow buffered write to skip full page if it's sector aligned > > [CAUSE] > Thankfully no upstream kernels are affected, even if some one is > mounting a btrfs created by x86_64 with inlined data extents, they won't > hit the corruption. > > The root cause is that when reading inline extents, we zero out the > whole remaining range until folio end. > > This means such zeroing out can cover ranges that is dirtied but not yet > written back, thus lead to data corruption. > > This needs all the following conditions to be met: > > - Sector size < page size > So no x86_64 is affected. The most common users should be Asahi Linux. > But they are safe due to the next two conditions. > > - Inline data extents are present > For sector size < page size cases, we do not allow creating new inline > data extents but only reading it. > > But even all above cases are met by using a x86_64 created btrfs with > inlined data extents, the next point will still save us. > > - Partial uptodate folios are allowed > This requires the out-of-tree patch "btrfs: allow buffered write to skip > full page if it's sector aligned", or buffered write will read out the > whole folio before dirting any range. > > So end users are completely safe. > > [TEST CASE] > The test case itself is pretty straightforward: > > - Buffered write [0, 4k) > - Drop all page cache > - Buffered write [8k, 12k) > - Verify the file content > > Signed-off-by: Qu Wenruo <wqu@xxxxxxxx> > --- > For anyone who wants to verify the failure, please fetch the following > branch, and reset to commit 4df35fbb829dfbcf64a914e5c8f652d9a3ad5227 > ("btrfs: allow inline data extents creation if sector size < page > size"). > > https://github.com/adam900710/linux.git subpage > > The top commit e7338d321bdf48e3b503c40e8eca7d7592709c83 > ("btrfs: fix inline data extent reads which zero out the remaining part") is the fix. > --- > tests/btrfs/327 | 58 +++++++++++++++++++++++++++++++++++++++++++++ > tests/btrfs/327.out | 2 ++ > 2 files changed, 60 insertions(+) > create mode 100755 tests/btrfs/327 > create mode 100644 tests/btrfs/327.out > > diff --git a/tests/btrfs/327 b/tests/btrfs/327 > new file mode 100755 > index 00000000..72269fc7 > --- /dev/null > +++ b/tests/btrfs/327 > @@ -0,0 +1,58 @@ > +#! /bin/bash > +# SPDX-License-Identifier: GPL-2.0 > +# Copyright (C) 2024 SUSE Linux Products GmbH. All Rights Reserved. > +# > +# FS QA Test 327 > +# > +# Make sure reading inlined extents doesn't cause any corruption. > +# > +# This is a preventive test case inspired by btrfs/149, which can cause > +# data corruption when the following out-of-tree patches are applied and > +# the sector size is smaller than page size: > +# > +# btrfs: allow inline data extents creation if sector size < page size > +# btrfs: allow buffered write to skip full page if it's sector aligned > +# > +# Thankfully no upstream kernel is affected. > + > +. ./common/preamble > +_begin_fstest auto quick compress > + > +_require_scratch > + > +# We need 4K sector size support, especially this case can only be triggered > +# with sector size < page size for now. > +# > +# We do not check the page size and not_run so far, as in the long term btrfs > +# will support larger folios, then in that future 4K page size should be enough > +# to trigger the bug. > +_require_btrfs_support_sectorsize 4096 > + > +_scratch_mkfs >>$seqres.full 2>&1 > +_scratch_mount "-o compress,max_inline=4095" > + > +# Create one inlined data extent, only when using compression we can > +# create an inlined data extent up to sectorsize. > +# And for sector size < page size cases, we need the out-of-tree patch > +# "btrfs: allow inline data extents creation if sector size < page size" to > +# create such extent. > +xfs_io -f -c "pwrite 0 4k" "$SCRATCH_MNT/foobar" > /dev/null $XFS_IO_PROG > + > +# Drop the cache, we need to read out the above inlined data extent. > +echo 3 > /proc/sys/vm/drop_caches > + > +# Write into range [8k, 12k), with the out-of-tree patch "btrfs: allow > +# buffered write to skip full page if it's sector aligned", such write will not > +# trigger the read on the folio. > +xfs_io -f -c "pwrite 8k 4k" "$SCRATCH_MNT/foobar" > /dev/null $XFS_IO_PROG > + > +# Verify the checksum, for the affected devel branch, the read of inline extent > +# will zero out all the remaining range of the folio, screwing up the content > +# at [8K, 12k). > +_md5_checksum "$SCRATCH_MNT/foobar" > + > +_scratch_unmount This's not needed if it's not a necessary test step. Others look good to me, if no more review points from btrfs list, I'll merge this patch with above changes. Reviewed-by: Zorro Lang <zlang@xxxxxxxxxx> > + > +# success, all done > +status=0 > +exit > diff --git a/tests/btrfs/327.out b/tests/btrfs/327.out > new file mode 100644 > index 00000000..aebf8c72 > --- /dev/null > +++ b/tests/btrfs/327.out > @@ -0,0 +1,2 @@ > +QA output created by 327 > +277f3840b275c74d01e979ea9d75ac19 > -- > 2.46.0 > >