Re: [PATCH 5/5] xfs: verify buffer, inode, and dquot items every tx commit

Dave Chinner <david@xxxxxxxxxxxxx> · Thu, 13 Jun 2024 17:04:47 +1000

On Wed, Jun 12, 2024 at 10:47:50AM -0700, Darrick J. Wong wrote:
> The actual defect here was an overzealous inode verifier, which was
> fixed in a separate patch.  This patch adds some transaction precommit
> functions for CONFIG_XFS_DEBUG=y mode so that we can detect these kinds
> of transient errors at transaction commit time, where it's much easier
> to find the root cause.

Ok, I can see the value in this for very strict integrity checking,
but I don't think that XONFIG_XFS_DEBUG context is right
for this level of checking. 

Think of the difference using xfs_assert_ilocked() with
CONFIG_XFS_DEBUG vs iusing CONFIG_PROVE_LOCKING to enable lockdep.
Lockdep checks a lot more about lock usage than our debug build
asserts and so may find deep, subtle issues that our asserts won't
find. However, that extra capability comes at a huge cost for
relatively little extra gain, and so most of the time people work
without CONFIG_PROVE_LOCKING enabled. A test run here or there, and
then when the code developement is done, but it's not used all the
time on every little change that is developed and tested.

In comparison, I can't remember the last time I did any testing with
CONFIG_XFS_DEBUG disabled. Even all my performance regression
testing is run with CONFIG_XFS_DEBUG=y, and a change like this one
would make any sort of load testing on debug kernels far to costly
and so all that testing would get done with debugging turned off.
That's a significant loss, IMO, because we'd lose more validation
from people turning CONFIG_XFS_DEBUG off than we'd gain from the
rare occasions this new commit verifier infrastructure would catch
a real bug.

Hence I think this should be pushed into a separate debug config
sub-option. Make it something we can easily turn on with
KASAN and lockdep when we our periodic costly extensive validation
test runs.

-Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx