On Tue, May 16, 2023 at 06:20:59PM -0700, Darrick J. Wong wrote: > On Wed, May 17, 2023 at 10:04:48AM +1000, Dave Chinner wrote: > > From: Dave Chinner <dchinner@xxxxxxxxxx> > > > > To fix a AGI-AGF-inode cluster buffer deadlock, we need to move > > inode cluster buffer oeprations to the ->iop_precommit() method. > > operations > > > However, this means that deferred operations can require precommits > > to be run on the final transaction that the deferred ops pass back > > to xfs_trans_commit() context. This will be exposed by attribute > > handling, in that the last changes to the inode in the attr set > > state machine "disappear" because the precommit operation is not run. > > Wait, what? That was exactly the reaction I had when attribute tests failed unexpectedly. Then a quick bit of traceprintk debugging (which I probably forgot to remove!) pointed me at shortform attr updates where no precommit was running... > OH, this is because the LARP state machine can log the inode in the > final transaction in its chain. __xfs_trans_commit calls the noroll > variant of xfs_defer_finish, which means that when we get to... Yes. > > Signed-off-by: Dave Chinner <dchinner@xxxxxxxxxx> > > --- > > fs/xfs/xfs_trans.c | 5 +++++ > > 1 file changed, 5 insertions(+) > > > > diff --git a/fs/xfs/xfs_trans.c b/fs/xfs/xfs_trans.c > > index 8afc0c080861..664084509af5 100644 > > --- a/fs/xfs/xfs_trans.c > > +++ b/fs/xfs/xfs_trans.c > > @@ -970,6 +970,11 @@ __xfs_trans_commit( > > error = xfs_defer_finish_noroll(&tp); > > if (error) > > goto out_unreserve; > > ...here, tp might actually be a dirty transaction. Prior to > xlog_cil_committing the dirty transaction, we need to run the precommit > hooks or else we don't actually (re)attach the inode cluster buffer to > the transaction, and everything goes batty from there. Right? Yes. > This isn't specific to LARP; any defer item that logs an item with a > precommit hook is subject to this. Right? Yes. But right now, the only precommit hook is the unlinked inode processing, which doesn't run from defer-ops context... > > + > > + /* Run precomits from final tx in defer chain */ > > precommits > > If the answers to the last 2 questions are 'yes' and the spelling errors > get fixed, I think I'm ok enough with this one to throw it at > fstestscloud. > > Reviewed-by: Darrick J. Wong <djwong@xxxxxxxxxx> Thanks! -Dave. -- Dave Chinner david@xxxxxxxxxxxxx