On Fri, Oct 02, 2020 at 09:30:06AM +0200, Christoph Hellwig wrote:
> On Thu, Oct 01, 2020 at 09:22:36PM -0700, Darrick J. Wong wrote:
> > From: Darrick J. Wong <darrick.wong@xxxxxxxxxx>
> >
> > In xfs_bui_item_recover, there exists a use-after-free bug with regards
> > to the inode that is involved in the bmap replay operation. If the
> > mapping operation does not complete, we call xfs_bmap_unmap_extent to
> > create a deferred op to finish the unmapping work, and we retain a
> > pointer to the incore inode.
> >
> > Unfortunately, the very next thing we do is commit the transaction and
> > drop the inode. If reclaim tears down the inode before we try to finish
> > the defer ops, we dereference garbage and blow up. Therefore, create a
> > way to join inodes to the defer ops freezer so that we can maintain the
> > xfs_inode reference until we're done with the inode.
> >
> > Note: This imposes the requirement that there be enough memory to keep
> > every incore inode in memory throughout recovery.
> >
> > Signed-off-by: Darrick J. Wong <darrick.wong@xxxxxxxxxx>
> > ---
> > v5.2: rebase on updated defer capture patches
> > ---
> >  fs/xfs/libxfs/xfs_defer.c  | 55 ++++++++++++++++++++++++++++++++++++++------
> >  fs/xfs/libxfs/xfs_defer.h  | 11 +++++++--
> >  fs/xfs/xfs_bmap_item.c     |  8 ++----
> >  fs/xfs/xfs_extfree_item.c  |  2 +-
> >  fs/xfs/xfs_log_recover.c   |  7 +++++-
> >  fs/xfs/xfs_refcount_item.c |  2 +-
> >  fs/xfs/xfs_rmap_item.c     |  2 +-
> >  7 files changed, 67 insertions(+), 20 deletions(-)
> >
> > diff --git a/fs/xfs/libxfs/xfs_defer.c b/fs/xfs/libxfs/xfs_defer.c
> > index e19dc1ced7e6..4af5752f9830 100644
> > --- a/fs/xfs/libxfs/xfs_defer.c
> > +++ b/fs/xfs/libxfs/xfs_defer.c
> > @@ -16,6 +16,7 @@
> >  #include "xfs_inode.h"
> >  #include "xfs_inode_item.h"
> >  #include "xfs_trace.h"
> > +#include "xfs_icache.h"
> >
> >  /*
> >   * Deferred Operations in XFS
> > @@ -553,10 +554,14 @@ xfs_defer_move(
> >   * deferred ops state is transferred to the capture structure and the
> >   * transaction is then ready for the caller to commit it. If there are no
> >   * intent items to capture, this function returns NULL.
> > + *
> > + * If inodes are passed in and this function returns a capture structure, the
> > + * inodes are now owned by the capture structure.
> >   */
> >  static struct xfs_defer_capture *
> >  xfs_defer_ops_capture(
> > -	struct xfs_trans		*tp)
> > +	struct xfs_trans		*tp,
> > +	struct xfs_inode		*ip)
> >  {
> >  	struct xfs_defer_capture	*dfc;
> >
> > @@ -582,6 +587,12 @@ xfs_defer_ops_capture(
> >  	/* Preserve the log reservation size. */
> >  	dfc->dfc_logres = tp->t_log_res;
> >
> > +	/*
> > +	 * Transfer responsibility for unlocking and releasing the inodes to
> > +	 * the capture structure.
> > +	 */
> > +	dfc->dfc_ip = ip;
> > +
>
> Maybe rename ip to capture_ip?

Ok.

> > +	ASSERT(ip == NULL || xfs_isilocked(ip, XFS_ILOCK_EXCL));
> > +
> >  	/* If we don't capture anything, commit transaction and exit. */
> > +	dfc = xfs_defer_ops_capture(tp, ip);
> > +	if (!dfc) {
> > +		error = xfs_trans_commit(tp);
> > +		if (ip) {
> > +			xfs_iunlock(ip, XFS_ILOCK_EXCL);
> > +			xfs_irele(ip);
> > +		}
> > +		return error;
> > +	}
>
> Instead of coming up with our own inode unlocking and release schemes,
> can't we just require that the inode is joined by passing the lock
> flags to xfs_trans_ijoin, and piggy back on xfs_trans_commit unlocking
> it in that case?

Yes, and let's also xfs_iget(capture_ip->i_ino) to increase the incore
inode's refcount, which would make it so that the caller would still
unlock and rele the reference that they got.

--D
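
For anyone following along, here is a toy userspace model of the
refcount idea: the capture structure takes its own reference on the
inode, so the caller can still drop the one it got without the inode
being torn down early. This is only a sketch under stated assumptions.
Every name in it (toy_inode, toy_capture, toy_igrab, toy_irele,
toy_capture_init, toy_capture_finish) is hypothetical and is not an XFS
API; locking and the transaction machinery are left out entirely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for an incore inode; NOT struct xfs_inode. */
struct toy_inode {
	unsigned long	ino;
	int		refcount;
	bool		torn_down;	/* set once the last reference is dropped */
};

/* Toy stand-in for the capture structure; NOT struct xfs_defer_capture. */
struct toy_capture {
	struct toy_inode	*ip;
};

/* Take an extra reference on an inode that is already in memory. */
static struct toy_inode *
toy_igrab(struct toy_inode *ip)
{
	ip->refcount++;
	return ip;
}

/* Drop one reference; model "reclaim" by flagging the inode on the last drop. */
static void
toy_irele(struct toy_inode *ip)
{
	assert(ip->refcount > 0);
	if (--ip->refcount == 0)
		ip->torn_down = true;
}

/*
 * The capture structure takes its own reference instead of stealing the
 * caller's, so the caller still releases the reference that it got.
 */
static void
toy_capture_init(struct toy_capture *dfc, struct toy_inode *capture_ip)
{
	dfc->ip = capture_ip ? toy_igrab(capture_ip) : NULL;
}

/* Once the deferred work is finished, the capture drops its reference. */
static void
toy_capture_finish(struct toy_capture *dfc)
{
	if (dfc->ip) {
		toy_irele(dfc->ip);
		dfc->ip = NULL;
	}
}
```

In this model the inode is flagged as torn down only after both the
caller and the capture structure have dropped their references, which
is the property the use-after-free fix needs.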