On Sun, Oct 02, 2022 at 11:20:12AM -0700, Darrick J. Wong wrote: > From: Darrick J. Wong <djwong@xxxxxxxxxx> > > Keys for extent interval records in the reverse mapping btree are > supposed to be computed as follows: > > (physical block, owner, fork, is_btree, offset) > > This provides users the ability to look up a reverse mapping from a file > block mapping record -- start with the physical block; then if there are > multiple records for the same block, move on to the owner; then the > inode fork type; and so on to the file offset. > > However, the key comparison functions incorrectly remove the fork/bmbt > information that's encoded in the on-disk offset. This means that > lookup comparisons are only done with: > > (physical block, owner, offset) > > This means that queries can return incorrect results. On consistent > filesystems this isn't an issue because bmbt blocks and blocks mapped to > an attr fork cannot be shared, but this prevents us from detecting > incorrect fork and bmbt flag bits in the rmap btree. > > A previous version of this patch forgot to keep the (un)written state > flag masked during the comparison and caused a major regression in > 5.9.x since unwritten extent conversion can update an rmap record > without requiring key updates. > > Note that blocks cannot go directly from data fork to attr fork without > being deallocated and reallocated, nor can they be added to or removed > from a bmbt without a free/alloc cycle, so this should not cause any > regressions. > > Found by fuzzing keys[1].attrfork = ones on xfs/371. > > Fixes: 4b8ed67794fe ("xfs: add rmap btree operations") > Signed-off-by: Darrick J. Wong <djwong@xxxxxxxxxx> > --- > fs/xfs/libxfs/xfs_rmap_btree.c | 25 +++++++++++++++++-------- > 1 file changed, 17 insertions(+), 8 deletions(-) > > > diff --git a/fs/xfs/libxfs/xfs_rmap_btree.c b/fs/xfs/libxfs/xfs_rmap_btree.c > index 7f83f62e51e0..e2e1f68cedf5 100644 > --- a/fs/xfs/libxfs/xfs_rmap_btree.c > +++ b/fs/xfs/libxfs/xfs_rmap_btree.c > @@ -219,6 +219,15 @@ xfs_rmapbt_init_ptr_from_cur( > ptr->s = agf->agf_roots[cur->bc_btnum]; > } > > +/* > + * Fork and bmbt are significant parts of the rmap record key, but written > + * status is merely a record attribute. > + */ > +static inline uint64_t offset_keymask(uint64_t offset) > +{ > + return offset & ~XFS_RMAP_OFF_UNWRITTEN; > +} Ok. but doesn't that mean xfs_rmapbt_init_key_from_rec() and xfs_rmapbt_init_high_key_from_rec() should be masking out the XFS_RMAP_OFF_UNWRITTEN bit as well? -Dave. -- Dave Chinner david@xxxxxxxxxxxxx