Re: [PATCH V14 00/16] Bail out if transaction can cause extent count to overflow

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon, May 23, 2022 at 02:15:44PM +0300, Amir Goldstein wrote:
> On Sun, Jan 10, 2021 at 6:10 PM Chandan Babu R <chandanrlinux@xxxxxxxxx> wrote:
> >
> > XFS does not check for possible overflow of per-inode extent counter
> > fields when adding extents to either data or attr fork.
> >
> > For e.g.
> > 1. Insert 5 million xattrs (each having a value size of 255 bytes) and
> >    then delete 50% of them in an alternating manner.
> >
> > 2. On a 4k block sized XFS filesystem instance, the above causes 98511
> >    extents to be created in the attr fork of the inode.
> >
> >    xfsaild/loop0  2008 [003]  1475.127209: probe:xfs_inode_to_disk: (ffffffffa43fb6b0) if_nextents=98511 i_ino=131
> >
> > 3. The incore inode fork extent counter is a signed 32-bit
> >    quantity. However, the on-disk extent counter is an unsigned 16-bit
> >    quantity and hence cannot hold 98511 extents.
> >
> > 4. The following incorrect value is stored in the xattr extent counter,
> >    # xfs_db -f -c 'inode 131' -c 'print core.naextents' /dev/loop0
> >    core.naextents = -32561
> >
> > This patchset adds a new helper function
> > (i.e. xfs_iext_count_may_overflow()) to check for overflow of the
> > per-inode data and xattr extent counters and invokes it before
> > starting an fs operation (e.g. creating a new directory entry). With
> > this patchset applied, XFS detects counter overflows and returns with
> > an error rather than causing a silent corruption.
> >
> > The patchset has been tested by executing xfstests with the following
> > mkfs.xfs options,
> > 1. -m crc=0 -b size=1k
> > 2. -m crc=0 -b size=4k
> > 3. -m crc=0 -b size=512
> > 4. -m rmapbt=1,reflink=1 -b size=1k
> > 5. -m rmapbt=1,reflink=1 -b size=4k
> >
> > The patches can also be obtained from
> > https://github.com/chandanr/linux.git at branch xfs-reserve-extent-count-v14.
> >
> > I have two patches that define the newly introduced error injection
> > tags in xfsprogs
> > (https://lore.kernel.org/linux-xfs/20201104114900.172147-1-chandanrlinux@xxxxxxxxx/).
> >
> > I have also written tests
> > (https://github.com/chandanr/xfstests/commits/extent-overflow-tests)
> > for verifying the checks introduced in the kernel.
> >
> 
> Hi Chandan and XFS folks,
> 
> As you may have heard, I am working on producing a series of
> xfs patches for stable v5.10.y.
> 
> My patch selection is documented at [1].
> I am in the process of testing the backport patches against the 5.10.y
> baseline using Luis' kdevops [2] fstests runner.
> 
> The configurations that we are testing are:
> 1. -m rmbat=0,reflink=1 -b size=4k (default)
> 2. -m crc=0 -b size=4k
> 3. -m crc=0 -b size=512
> 4. -m rmapbt=1,reflink=1 -b size=1k
> 5. -m rmapbt=1,reflink=1 -b size=4k
> 
> This patch set is the only largish series that I selected, because:
> - It applies cleanly to 5.10.y
> - I evaluated it as low risk and high value

What value does it provide LTS users?

This series adds almost no value to normal users - extent count
overflows are just something that doesn't happen in production
systems at this point in time. The largest data extent count I've
ever seen is still an order of magnitude of extents away from
overflowing (i.e. 400 million extents seen, 4 billion to overflow),
and nobody is using the attribute fork sufficiently hard to overflow
65536 extents (typically a couple of million xattrs per inode).

i.e. this series is ground work for upcoming internal filesystem
functionality that require much larger attribute forks (parent
pointers and fsverity merkle tree storage) to be supported, and
allow scope for much larger, massively fragmented VM image files
(beyond 16TB on 4kB block size fs for worst case
fragmentation/reflink). 

As a standalone patchset, this provides almost no real benefit to
users but adds a whole new set of "hard stop" error paths across
every operation that does inode data/attr extent allocation. i.e.
the scope of affected functionality is very wide, the benefit
to users is pretty much zero.

Hence I'm left wondering what criteria ranks this as a high value
change...

> - Chandan has written good regression tests
>
> I intend to post the rest of the individual selected patches
> for review in small batches after they pass the tests, but w.r.t this
> patch set -
> 
> Does anyone object to including it in the stable kernel
> after it passes the tests?

I prefer that the process doesn't result in taking random unnecesary
functionality into stable kernels. The part of the LTS process that
I've most disagreed with is the "backport random unnecessary
changes" part of the stable selection criteria. It doesn't matter if
it's selected by a bot or a human, the problems that causes are the
same.

Hence on those grounds, I'd say this isn't a stable backport
candidate at all...

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx



[Index of Archives]     [XFS Filesystem Development (older mail)]     [Linux Filesystem Development]     [Linux Audio Users]     [Yosemite Trails]     [Linux Kernel]     [Linux RAID]     [Linux SCSI]


  Powered by Linux