Re: [PATCH 3/3] xfs: fix an incore inode UAF in xfs_bui_recover

On Mon, Sep 28, 2020 at 04:10:46PM +1000, Dave Chinner wrote:
> On Sun, Sep 27, 2020 at 04:41:56PM -0700, Darrick J. Wong wrote:
> > From: Darrick J. Wong <darrick.wong@xxxxxxxxxx>
> > 
> > In xfs_bui_item_recover, there exists a use-after-free bug with regard
> > to the inode that is involved in the bmap replay operation.  If the
> > mapping operation does not complete, we call xfs_bmap_unmap_extent to
> > create a deferred op to finish the unmapping work, and we retain a
> > pointer to the incore inode.
> > 
> > Unfortunately, the very next thing we do is commit the transaction and
> > drop the inode.  If reclaim tears down the inode before we try to finish
> > the defer ops, we dereference garbage and blow up.  Therefore, create a
> > way to join inodes to the defer ops freezer so that we can maintain the
> > xfs_inode reference until we're done with the inode.
> 
> Honest first reaction now that I understand what the capture stuff is
> doing: Ewww! Gross!

Yes, the whole thing is gross.  Honestly, I wish I could go back in time
to 2016 to warn myself that we would need a way to reassemble entire
runtime transactions + dfops chains so that we could avoid all this.

> We only need to store a single inode, so the whole "2 inodes for
> symmetry with defer_ops" greatly overcomplicates the code. This
> could be *much* simpler.

Indeed, see my comment at the very end.

> > diff --git a/fs/xfs/xfs_icache.c b/fs/xfs/xfs_icache.c
> > index deb99300d171..c7f65e16534f 100644
> > --- a/fs/xfs/xfs_icache.c
> > +++ b/fs/xfs/xfs_icache.c
> > @@ -12,6 +12,7 @@
> >  #include "xfs_sb.h"
> >  #include "xfs_mount.h"
> >  #include "xfs_inode.h"
> > +#include "xfs_defer.h"
> >  #include "xfs_trans.h"
> >  #include "xfs_trans_priv.h"
> >  #include "xfs_inode_item.h"
> > @@ -1689,3 +1690,43 @@ xfs_start_block_reaping(
> >  	xfs_queue_eofblocks(mp);
> >  	xfs_queue_cowblocks(mp);
> >  }
> > +
> > +/*
> > + * Prepare the inodes to participate in further log intent item recovery.
> > + * For now, that means attaching dquots and locking them, since libxfs doesn't
> > + * know how to do that.
> > + */
> > +void
> > +xfs_defer_continue_inodes(
> > +	struct xfs_defer_capture	*dfc,
> > +	struct xfs_trans		*tp)
> > +{
> > +	int				i;
> > +	int				error;
> > +
> > +	for (i = 0; i < XFS_DEFER_OPS_NR_INODES && dfc->dfc_inodes[i]; i++) {
> > +		error = xfs_qm_dqattach(dfc->dfc_inodes[i]);
> > +		if (error)
> > +			tp->t_mountp->m_qflags &= ~XFS_ALL_QUOTA_CHKD;
> > +	}
> > +
> > +	if (dfc->dfc_inodes[1])
> > +		xfs_lock_two_inodes(dfc->dfc_inodes[0], XFS_ILOCK_EXCL,
> > +				    dfc->dfc_inodes[1], XFS_ILOCK_EXCL);
> > +	else if (dfc->dfc_inodes[0])
> > +		xfs_ilock(dfc->dfc_inodes[0], XFS_ILOCK_EXCL);
> > +	dfc->dfc_ilocked = true;
> > +}
> > +
> > +/* Release all the inodes attached to this dfops capture device. */
> > +void
> > +xfs_defer_capture_irele(
> > +	struct xfs_defer_capture	*dfc)
> > +{
> > +	unsigned int			i;
> > +
> > +	for (i = 0; i < XFS_DEFER_OPS_NR_INODES && dfc->dfc_inodes[i]; i++) {
> > +		xfs_irele(dfc->dfc_inodes[i]);
> > +		dfc->dfc_inodes[i] = NULL;
> > +	}
> > +}
> 
> None of this belongs in xfs_icache.c. The function namespace tells
> me where it should be...

Agreed.  Originally this couldn't really be in libxfs because xfs_iget
has a different method signature in userspace, but now that we're just
storing the inode pointers directly, there's no need to split this
anymore.

> > diff --git a/fs/xfs/xfs_log_recover.c b/fs/xfs/xfs_log_recover.c
> > index 0d899ab7df2e..1463c3097240 100644
> > --- a/fs/xfs/xfs_log_recover.c
> > +++ b/fs/xfs/xfs_log_recover.c
> > @@ -1755,23 +1755,43 @@ xlog_recover_release_intent(
> >  	spin_unlock(&ailp->ail_lock);
> >  }
> >  
> > +static inline void
> > +xlog_recover_irele(
> > +	struct xfs_inode	*ip)
> > +{
> > +	xfs_iunlock(ip, XFS_ILOCK_EXCL);
> > +	xfs_irele(ip);
> > +}
> 
> Just open code it, please.
> 
> >  int
> > -xlog_recover_trans_commit(
> > +xlog_recover_trans_commit_inodes(
> >  	struct xfs_trans		*tp,
> > -	struct list_head		*capture_list)
> > +	struct list_head		*capture_list,
> > +	struct xfs_inode		*ip1,
> > +	struct xfs_inode		*ip2)
> 
> So are these inodes supposed to be locked, referenced and/or ???

ILOCK'd and referenced.
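
To spell that contract out, here is a minimal userspace sketch, not kernel
code.  The toy_inode type and the toy_* helpers are hypothetical stand-ins
for the incore inode, xfs_iunlock and xfs_irele.  It only models the
"nothing was captured" path, where the function still owns the inodes and
has to unlock and release each distinct one exactly once, open coded as
requested above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an incore inode: one lock flag, one refcount. */
struct toy_inode {
	bool	ilocked;
	int	refcount;
};

/* Models xfs_iunlock(ip, XFS_ILOCK_EXCL); the inode must be locked. */
static void toy_iunlock(struct toy_inode *ip)
{
	assert(ip->ilocked);
	ip->ilocked = false;
}

/* Models xfs_irele(ip): drop one reference. */
static void toy_irele(struct toy_inode *ip)
{
	assert(ip->refcount > 0);
	ip->refcount--;
}

/*
 * Models the no-capture path.  The caller handed over inodes that are
 * ILOCK'd and referenced, so we still own them here.  commit_error stands
 * in for the return value of xfs_trans_commit().
 */
static int toy_commit_no_capture(int commit_error, struct toy_inode *ip1,
				 struct toy_inode *ip2)
{
	/* Skip ip2 if it aliases ip1, so nothing is released twice. */
	if (ip2 && ip2 != ip1) {
		toy_iunlock(ip2);
		toy_irele(ip2);
	}
	if (ip1) {
		toy_iunlock(ip1);
		toy_irele(ip1);
	}
	return commit_error;
}
```

Passing the same inode as both ip1 and ip2, or passing NULL for either,
leaves every inode unlocked with its reference dropped once, and the commit
error is returned unchanged.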

> >  {
> >  	struct xfs_mount		*mp = tp->t_mountp;
> > -	struct xfs_defer_capture	*dfc = xfs_defer_capture(tp);
> > +	struct xfs_defer_capture	*dfc = xfs_defer_capture(tp, ip1, ip2);
> >  	int				error;
> 
> That's the second time putting this logic up in the declaration list
> has made me wonder where something in this function is initialised.
> Please move it into the code so that it is obvious.
> 
> >  
> >  	/* If we don't capture anything, commit tp and exit. */
> > -	if (!dfc)
> > -		return xfs_trans_commit(tp);
> > +	if (!dfc) {
> 
> i.e. before this line.
> 
> 	dfc = xfs_defer_capture(tp, ip1, ip2);
> 	if (!dfc) {

Ok.

> 
> > +		error = xfs_trans_commit(tp);
> > +
> > +		/* We still own the inodes, so unlock and release them. */
> > +		if (ip2 && ip2 != ip1)
> > +			xlog_recover_irele(ip2);
> > +		if (ip1)
> > +			xlog_recover_irele(ip1);
> > +		return error;
> > +	}
> 
> Not a fan of the unnecessary complexity of this.

Yeah, I got ahead of myself -- for atomic extent swapping we'll need to
be able to capture two inodes, so I went straight for the end goal.
I'll rip it out to simplify things for now, but this all will come back
in some form...
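
For the single-inode version, the lifecycle I have in mind looks roughly
like the userspace sketch below.  None of this is the real code: the
sketch_* names are hypothetical, and it assumes the ILOCK is dropped while
the capture is parked and retaken when recovery continues, which is only
what the dfc_ilocked flag in the patch suggests.  What matters is that the
capture owns the inode reference until the deferred work is finished, so
reclaim cannot tear the inode down in between.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an incore inode. */
struct sketch_inode {
	bool	ilocked;
	int	refcount;
};

/* Hypothetical single-inode capture structure. */
struct sketch_capture {
	struct sketch_inode	*ip;		/* reference owned by the capture */
	bool			ilocked;	/* true once continue has locked ip */
};

/*
 * Take over the caller's reference on a locked inode.  Assumption: the lock
 * is dropped here and retaken by sketch_capture_continue().
 */
static void sketch_capture_inode(struct sketch_capture *dfc,
				 struct sketch_inode *ip)
{
	assert(ip->ilocked && ip->refcount > 0);
	ip->ilocked = false;
	dfc->ip = ip;
	dfc->ilocked = false;
}

/* Relock the captured inode before finishing the deferred work. */
static void sketch_capture_continue(struct sketch_capture *dfc)
{
	if (!dfc->ip)
		return;
	assert(!dfc->ip->ilocked);
	dfc->ip->ilocked = true;
	dfc->ilocked = true;
}

/* Unlock if needed, then drop the reference the capture was holding. */
static void sketch_capture_release(struct sketch_capture *dfc)
{
	if (!dfc->ip)
		return;
	if (dfc->ilocked) {
		dfc->ip->ilocked = false;
		dfc->ilocked = false;
	}
	assert(dfc->ip->refcount > 0);
	dfc->ip->refcount--;
	dfc->ip = NULL;
}
```

The reference count stays at its captured value from capture through
continue and only drops in the release step; calling release a second time
is a no-op because the pointer has been cleared.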

--D

> Cheers,
> 
> Dave.
> -- 
> Dave Chinner
> david@xxxxxxxxxxxxx


