Re: [PATCH 1/2] xfs: fix rmap key comparison functions

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, Nov 02, 2022 at 10:40:22AM +1100, Dave Chinner wrote:
> On Sun, Oct 02, 2022 at 11:20:12AM -0700, Darrick J. Wong wrote:
> > From: Darrick J. Wong <djwong@xxxxxxxxxx>
> > 
> > Keys for extent interval records in the reverse mapping btree are
> > supposed to be computed as follows:
> > 
> > (physical block, owner, fork, is_btree, offset)
> > 
> > This provides users the ability to look up a reverse mapping from a file
> > block mapping record -- start with the physical block; then if there are
> > multiple records for the same block, move on to the owner; then the
> > inode fork type; and so on to the file offset.
> > 
> > However, the key comparison functions incorrectly remove the fork/bmbt
> > information that's encoded in the on-disk offset.  This means that
> > lookup comparisons are only done with:
> > 
> > (physical block, owner, offset)
> > 
> > This means that queries can return incorrect results.  On consistent
> > filesystems this isn't an issue because bmbt blocks and blocks mapped to
> > an attr fork cannot be shared, but this prevents us from detecting
> > incorrect fork and bmbt flag bits in the rmap btree.
> > 
> > A previous version of this patch forgot to keep the (un)written state
> > flag masked during the comparison and caused a major regression in
> > 5.9.x since unwritten extent conversion can update an rmap record
> > without requiring key updates.
> > 
> > Note that blocks cannot go directly from data fork to attr fork without
> > being deallocated and reallocated, nor can they be added to or removed
> > from a bmbt without a free/alloc cycle, so this should not cause any
> > regressions.
> > 
> > Found by fuzzing keys[1].attrfork = ones on xfs/371.
> > 
> > Fixes: 4b8ed67794fe ("xfs: add rmap btree operations")
> > Signed-off-by: Darrick J. Wong <djwong@xxxxxxxxxx>
> > ---
> >  fs/xfs/libxfs/xfs_rmap_btree.c |   25 +++++++++++++++++--------
> >  1 file changed, 17 insertions(+), 8 deletions(-)
> > 
> > 
> > diff --git a/fs/xfs/libxfs/xfs_rmap_btree.c b/fs/xfs/libxfs/xfs_rmap_btree.c
> > index 7f83f62e51e0..e2e1f68cedf5 100644
> > --- a/fs/xfs/libxfs/xfs_rmap_btree.c
> > +++ b/fs/xfs/libxfs/xfs_rmap_btree.c
> > @@ -219,6 +219,15 @@ xfs_rmapbt_init_ptr_from_cur(
> >  	ptr->s = agf->agf_roots[cur->bc_btnum];
> >  }
> >  
> > +/*
> > + * Fork and bmbt are significant parts of the rmap record key, but written
> > + * status is merely a record attribute.
> > + */
> > +static inline uint64_t offset_keymask(uint64_t offset)
> > +{
> > +	return offset & ~XFS_RMAP_OFF_UNWRITTEN;
> > +}
> 
> Ok. but doesn't that mean xfs_rmapbt_init_key_from_rec() and
> xfs_rmapbt_init_high_key_from_rec() should be masking out the
> XFS_RMAP_OFF_UNWRITTEN bit as well?

It ought to, but it might be too late for that because
_init_*key_from_rec have been letting the unwritten bit slip into the
rmap key structures since 4.8.  Somewhere out there is a filesystem with
rmapbt node blocks containing struct xfs_rmap_key's with that unwritten
bit set.  The best we can do is ignore it in the key comparison
function.

Let me think about this overnight though, because once we stop paying
attention to the unwritten bit for key comparisons, it might not matter
what's in the ondisk node blocks.

--D

> -Dave.
> 
> -- 
> Dave Chinner
> david@xxxxxxxxxxxxx



[Index of Archives]     [XFS Filesystem Development (older mail)]     [Linux Filesystem Development]     [Linux Audio Users]     [Yosemite Trails]     [Linux Kernel]     [Linux RAID]     [Linux SCSI]


  Powered by Linux