On Thu, Oct 01, 2020 at 09:22:36PM -0700, Darrick J. Wong wrote:
> From: Darrick J. Wong <darrick.wong@xxxxxxxxxx>
>
> In xfs_bui_item_recover, there exists a use-after-free bug with regards
> to the inode that is involved in the bmap replay operation. If the
> mapping operation does not complete, we call xfs_bmap_unmap_extent to
> create a deferred op to finish the unmapping work, and we retain a
> pointer to the incore inode.
>
> Unfortunately, the very next thing we do is commit the transaction and
> drop the inode. If reclaim tears down the inode before we try to finish
> the defer ops, we dereference garbage and blow up. Therefore, create a
> way to join inodes to the defer ops freezer so that we can maintain the
> xfs_inode reference until we're done with the inode.
>
> Note: This imposes the requirement that there be enough memory to keep
> every incore inode in memory throughout recovery.
>
> Signed-off-by: Darrick J. Wong <darrick.wong@xxxxxxxxxx>
> ---
> v5.2: rebase on updated defer capture patches
> ---
>  fs/xfs/libxfs/xfs_defer.c  |   55 ++++++++++++++++++++++++++++++++++++++------
>  fs/xfs/libxfs/xfs_defer.h  |   11 +++++++--
>  fs/xfs/xfs_bmap_item.c     |    8 ++----
>  fs/xfs/xfs_extfree_item.c  |    2 +-
>  fs/xfs/xfs_log_recover.c   |    7 +++++-
>  fs/xfs/xfs_refcount_item.c |    2 +-
>  fs/xfs/xfs_rmap_item.c     |    2 +-
>  7 files changed, 67 insertions(+), 20 deletions(-)
>
> diff --git a/fs/xfs/libxfs/xfs_defer.c b/fs/xfs/libxfs/xfs_defer.c
> index e19dc1ced7e6..4af5752f9830 100644
> --- a/fs/xfs/libxfs/xfs_defer.c
> +++ b/fs/xfs/libxfs/xfs_defer.c
> @@ -16,6 +16,7 @@
>  #include "xfs_inode.h"
>  #include "xfs_inode_item.h"
>  #include "xfs_trace.h"
> +#include "xfs_icache.h"
>
>  /*
>   * Deferred Operations in XFS
> @@ -553,10 +554,14 @@ xfs_defer_move(
>   * deferred ops state is transferred to the capture structure and the
>   * transaction is then ready for the caller to commit it. If there are no
>   * intent items to capture, this function returns NULL.
> + *
> + * If inodes are passed in and this function returns a capture structure, the
> + * inodes are now owned by the capture structure.
>   */
>  static struct xfs_defer_capture *
>  xfs_defer_ops_capture(
> -	struct xfs_trans		*tp)
> +	struct xfs_trans		*tp,
> +	struct xfs_inode		*ip)
>  {
>  	struct xfs_defer_capture	*dfc;
>
> @@ -582,6 +587,12 @@ xfs_defer_ops_capture(
>  	/* Preserve the log reservation size. */
>  	dfc->dfc_logres = tp->t_log_res;
>
> +	/*
> +	 * Transfer responsibility for unlocking and releasing the inodes to
> +	 * the capture structure.
> +	 */
> +	dfc->dfc_ip = ip;
> +

Maybe rename ip to capture_ip?

> +	ASSERT(ip == NULL || xfs_isilocked(ip, XFS_ILOCK_EXCL));
> +
>  	/* If we don't capture anything, commit transaction and exit. */
> +	dfc = xfs_defer_ops_capture(tp, ip);
> +	if (!dfc) {
> +		error = xfs_trans_commit(tp);
> +		if (ip) {
> +			xfs_iunlock(ip, XFS_ILOCK_EXCL);
> +			xfs_irele(ip);
> +		}
> +		return error;
> +	}

Instead of coming up with our own inode unlocking and release schemes,
can't we just require that the inode is joined by passing the lock
flags to xfs_trans_ijoin, and piggyback on xfs_trans_commit unlocking
it in that case?
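
To make the suggestion concrete, here is a toy userspace model of the
ownership pattern I have in mind. It is not kernel code: toy_inode,
toy_trans, toy_trans_ijoin and toy_trans_commit are hypothetical
stand-ins invented for illustration. The only thing it borrows from the
real API is the idea that a nonzero lock_flags argument at join time
hands the unlock over to commit.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration only; not the XFS API. */
#define TOY_ILOCK_EXCL	0x1u
#define TOY_MAX_JOINED	4

struct toy_inode {
	unsigned int	locked;		/* bitmask of lock flags currently held */
	int		refcount;	/* references held on this inode */
};

struct toy_trans {
	struct toy_inode	*joined[TOY_MAX_JOINED];
	unsigned int		lock_flags[TOY_MAX_JOINED];
	size_t			njoined;
};

/* Take the given lock flags on the inode. */
void toy_ilock(struct toy_inode *ip, unsigned int flags)
{
	ip->locked |= flags;
}

/* Drop the given lock flags on the inode. */
void toy_iunlock(struct toy_inode *ip, unsigned int flags)
{
	ip->locked &= ~flags;
}

/*
 * Join a locked inode to the transaction.  Nonzero lock_flags mean the
 * transaction takes over responsibility for unlocking at commit time.
 */
int toy_trans_ijoin(struct toy_trans *tp, struct toy_inode *ip,
		    unsigned int lock_flags)
{
	if (tp->njoined >= TOY_MAX_JOINED)
		return -1;
	tp->joined[tp->njoined] = ip;
	tp->lock_flags[tp->njoined] = lock_flags;
	tp->njoined++;
	return 0;
}

/* Commit: unlock every inode that was joined with lock flags. */
int toy_trans_commit(struct toy_trans *tp)
{
	size_t i;

	for (i = 0; i < tp->njoined; i++) {
		if (tp->lock_flags[i])
			toy_iunlock(tp->joined[i], tp->lock_flags[i]);
	}
	tp->njoined = 0;
	return 0;
}
```

In this model the caller locks the inode, joins it with TOY_ILOCK_EXCL,
and commits; there is no open-coded unlock on the "nothing captured"
path. Note that commit leaves refcount alone, so dropping the reference
would still be the caller's (or the capture structure's) job.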