On Wed, Aug 12, 2020 at 07:25:56PM +1000, Dave Chinner wrote: > From: Dave Chinner <dchinner@xxxxxxxxxx> > > The O_TMPFILE creation implementation creates a specific order of > operations for inode allocation/freeing and unlinked list > modification. Currently both are serialised by the AGI, so the order > doesn't strictly matter as long as the are both in the same > transaction. > > However, if we want to move the unlinked list insertions largely > out from under the AGI lock, then we have to be concerned about the > order in which we do unlinked list modification operations. > O_TMPFILE creation tells us this order is inode allocation/free, > then unlinked list modification. > > Change xfs_ifree() to use this same ordering on unlinked list > removal. THis way we always guarantee that when we enter the "This"... > iunlinked list removal code from this path, we have the already "have the already locked" ... what do we have locked? The AGI? > locked and we don't have to worry about lock nesting AGI reads > inside unlink list locks because it's already locked and attached to > the transaction. > > We can do this safely as the inode freeing and unlinked list removal > are done in the same transaction and hence are atomic operations > with resepect to log recovery. "respect"... With the commit log edited a bit, Reviewed-by: Darrick J. Wong <darrick.wong@xxxxxxxxxx> --D > > Signed-off-by: Dave Chinner <dchinner@xxxxxxxxxx> > --- > fs/xfs/xfs_inode.c | 22 ++++++++++++---------- > 1 file changed, 12 insertions(+), 10 deletions(-) > > diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c > index ce128ff12762..7ee778bcde06 100644 > --- a/fs/xfs/xfs_inode.c > +++ b/fs/xfs/xfs_inode.c > @@ -2283,14 +2283,13 @@ xfs_ifree_cluster( > } > > /* > - * This is called to return an inode to the inode free list. > - * The inode should already be truncated to 0 length and have > - * no pages associated with it. This routine also assumes that > - * the inode is already a part of the transaction. > + * This is called to return an inode to the inode free list. The inode should > + * already be truncated to 0 length and have no pages associated with it. This > + * routine also assumes that the inode is already a part of the transaction. > * > - * The on-disk copy of the inode will have been added to the list > - * of unlinked inodes in the AGI. We need to remove the inode from > - * that list atomically with respect to freeing it here. > + * The on-disk copy of the inode will have been added to the list of unlinked > + * inodes in the AGI. We need to remove the inode from that list atomically with > + * respect to freeing it here. > */ > int > xfs_ifree( > @@ -2308,13 +2307,16 @@ xfs_ifree( > ASSERT(ip->i_d.di_nblocks == 0); > > /* > - * Pull the on-disk inode from the AGI unlinked list. > + * Free the inode first so that we guarantee that the AGI lock is going > + * to be taken before we remove the inode from the unlinked list. This > + * makes the AGI lock -> unlinked list modification order the same as > + * used in O_TMPFILE creation. > */ > - error = xfs_iunlink_remove(tp, ip); > + error = xfs_difree(tp, ip->i_ino, &xic); > if (error) > return error; > > - error = xfs_difree(tp, ip->i_ino, &xic); > + error = xfs_iunlink_remove(tp, ip); > if (error) > return error; > > -- > 2.26.2.761.g0e0b3e54be >