Re: [PATCH 05/12] xfs: track cow/shared record domains explicitly in xfs_refcount_irec

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, Oct 28, 2022 at 08:03:07AM +1100, Dave Chinner wrote:
> On Thu, Oct 27, 2022 at 10:14:31AM -0700, Darrick J. Wong wrote:
> > From: Darrick J. Wong <djwong@xxxxxxxxxx>
> > 
> > Just prior to committing the reflink code into upstream, the xfs
> > maintainer at the time requested that I find a way to shard the refcount
> > records into two domains -- one for records tracking shared extents, and
> > a second for tracking CoW staging extents.  The idea here was to
> > minimize mount time CoW reclamation by pushing all the CoW records to
> > the right edge of the keyspace, and it was accomplished by setting the
> > upper bit in rc_startblock.  We don't allow AGs to have more than 2^31
> > blocks, so the bit was free.
> > 
> > Unfortunately, this was a very late addition to the codebase, so most of
> > the refcount record processing code still treats rc_startblock as a u32
> > and pays no attention to whether or not the upper bit (the cow flag) is
> > set.  This is a weakness is theoretically exploitable, since we're not
> > fully validating the incoming metadata records.
> > 
> > Fuzzing demonstrates practical exploits of this weakness.  If the cow
> > flag of a node block key record is corrupted, a lookup operation can go
> > to the wrong record block and start returning records from the wrong
> > cow/shared domain.  This causes the math to go all wrong (since cow
> > domain is still implicit in the upper bit of rc_startblock) and we can
> > crash the kernel by tricking xfs into jumping into a nonexistent AG and
> > tripping over xfs_perag_get(mp, <nonexistent AG>) returning NULL.
> > 
> > To fix this, start tracking the domain as an explicit part of struct
> > xfs_refcount_irec, adjust all refcount functions to check the domain
> > of a returned record, and alter the function definitions to accept them
> > where necessary.
> > 
> > Found by fuzzing keys[2].cowflag = add in xfs/464.
> > 
> > Signed-off-by: Darrick J. Wong <djwong@xxxxxxxxxx>
> 
> Couple of minor things.
> 
> > @@ -169,12 +183,17 @@ xfs_refcount_update(
> >  	struct xfs_refcount_irec	*irec)
> >  {
> >  	union xfs_btree_rec	rec;
> > +	__u32			start;
> >  	int			error;
> 
> Why __u32 and not, say, u32 or uint32_t? u32 is used 10x more often
> than __u32 in the kernel code, and in XFS only seem to use the __
> variants in UAPI structures.

Force of habit, I guess.  I'll rework it as uint32_t.

> > @@ -364,6 +388,7 @@ xfs_refcount_split_extent(
> >  		error = -EFSCORRUPTED;
> >  		goto out_error;
> >  	}
> > +
> >  	if (rcext.rc_startblock == agbno || xfs_refc_next(&rcext) <= agbno)
> >  		return 0;
> >  
> 
> Random new whitespace?

Oops.  I overlooked that when I was slicing and dicing yesterday.

> Other than that it looks good.
> 
> Reviewed-by: Dave Chinner <dchinner@xxxxxxxxxx>

Thanks!

--D

> -- 
> Dave Chinner
> david@xxxxxxxxxxxxx



[Index of Archives]     [XFS Filesystem Development (older mail)]     [Linux Filesystem Development]     [Linux Audio Users]     [Yosemite Trails]     [Linux Kernel]     [Linux RAID]     [Linux SCSI]


  Powered by Linux