Hi all,

Fuzz testing of the refcount btree demonstrated a weakness in validation
of refcount btree records during normal runtime.  The idea of using the
upper bit of the rc_startblock field to separate the refcount records
into one group for shared space and another for CoW staging extents was
added at the last minute.  The incore struct left this bit encoded in
the upper bit of the startblock field, which makes it all too easy for
arithmetic operations to overflow if we don't detect the cowflag
properly.

When I ran a norepair fuzz tester, I was able to crash the kernel on one
of these accidental overflows by fuzzing a key record in a node block,
which broke lookups.  To fix the problem, make the domain (shared/cow) a
separate field in the incore record.

Unfortunately, a customer also hit this once in production.  Due to bugs
in the kernel running on the VM host, writes to the disk image would
occasionally be lost.  Given sufficient memory pressure on the VM guest,
a refcountbt xfs_buf could be reclaimed and later reloaded from the
stale copy on the virtual disk.  The stale disk contents were a refcount
btree leaf block full of records for the wrong domain, and this caused
an infinite loop in the guest VM.

If you're going to start using this mess, you probably ought to just
pull from my git trees, which are linked below.

This is an extraordinary way to destroy everything.  Enjoy!
Comments and questions are, as always, welcome.

--D

kernel git tree:
https://git.kernel.org/cgit/linux/kernel/git/djwong/xfs-linux.git/log/?h=refcount-cow-domain-6.1
---
 fs/xfs/libxfs/xfs_format.h         |   22 ---
 fs/xfs/libxfs/xfs_refcount.c       |  269 ++++++++++++++++++++++++++----------
 fs/xfs/libxfs/xfs_refcount.h       |    9 +
 fs/xfs/libxfs/xfs_refcount_btree.c |   26 +++
 fs/xfs/libxfs/xfs_types.h          |   30 ++++
 fs/xfs/scrub/refcount.c            |   72 ++++------
 fs/xfs/xfs_trace.h                 |   48 +++++-
 7 files changed, 324 insertions(+), 152 deletions(-)
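
For readers who want a concrete picture of the problem described in the
cover letter, here is a minimal userspace sketch.  It is not taken from
the patches.  Every identifier (EX_COW_FLAG, ex_refc_irec, ex_decode and
so on) is hypothetical, and it assumes a 32-bit AG block number whose
top bit carries the CoW flag on disk.  It shows how arithmetic on a
still-encoded startblock can wrap, and how decoding the domain into its
own incore field lets the bounds check work on a clean block number.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* All names here are hypothetical and exist only for this illustration. */
#define EX_COW_FLAG	(UINT32_C(1) << 31)

enum ex_domain {
	EX_DOMAIN_SHARED = 0,
	EX_DOMAIN_COW,
};

/* Incore record: the domain lives in its own field, not in startblock. */
struct ex_refc_irec {
	uint32_t	startblock;	/* AG block number, flag bit stripped */
	uint32_t	blockcount;
	uint32_t	refcount;
	enum ex_domain	domain;
};

/* The hazard: adding to the still-encoded start can wrap past 2^32. */
static inline uint32_t ex_end_encoded(uint32_t disk_start, uint32_t blockcount)
{
	return disk_start + blockcount;
}

/* Split the on-disk start into a plain block number and a domain. */
static inline struct ex_refc_irec
ex_decode(uint32_t disk_start, uint32_t blockcount, uint32_t refcount)
{
	struct ex_refc_irec irec;

	irec.domain = (disk_start & EX_COW_FLAG) ? EX_DOMAIN_COW
						 : EX_DOMAIN_SHARED;
	irec.startblock = disk_start & ~EX_COW_FLAG;
	irec.blockcount = blockcount;
	irec.refcount = refcount;
	return irec;
}

/* Re-encode the start for disk; the incore start must not carry the flag. */
static inline uint32_t ex_encode_start(const struct ex_refc_irec *irec)
{
	assert(!(irec->startblock & EX_COW_FLAG));
	if (irec->domain == EX_DOMAIN_COW)
		return irec->startblock | EX_COW_FLAG;
	return irec->startblock;
}

/* Bounds check on the decoded record, written so it cannot overflow. */
static inline bool
ex_irec_in_ag(const struct ex_refc_irec *irec, uint32_t ag_blocks)
{
	return irec->blockcount > 0 &&
	       irec->startblock < ag_blocks &&
	       irec->blockcount <= ag_blocks - irec->startblock;
}
```

With a fuzzed record such as disk_start = 0x80000010 and blockcount =
0x80000000, ex_end_encoded() wraps around to 0x10 and looks like a
plausible block number, whereas ex_decode() followed by ex_irec_in_ag()
rejects the same record.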