[PATCHSET 5.16-rcX 0/2] xfs: fix data corruption when cycling ro/rw mounts

"Darrick J. Wong" <djwong@xxxxxxxxxx> · Tue, 07 Dec 2021 10:35:39 -0800

Hi all,

As part of a large customer escalation, I've been combing through the
XFS copy-on-write code to try to find sources of (mostly) silent data
corruption.  I found a nasty problem in the remount code, wherein a ro
remount can race with file reader threads and fail to clean out cached
inode COW forks.  A subsequent rw remount calls the COW staging extent
recovery code, which frees the space but does not update the records in
the cached inode COW forks.  This leads to massive fs corruption.
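
To make the sequence concrete, here is a minimal userspace sketch of
the failure mode.  It is a toy model, not the kernel code: every name
in it (toy_fs, toy_inode, toy_run, the lost_race flag) is hypothetical,
and the race is reduced to a boolean that makes the ro remount skip
the cached inode.

```c
#include <assert.h>
#include <stdbool.h>

#define NBLOCKS  8
#define NO_BLOCK (-1)

/* Toy stand-in for on-disk space accounting (hypothetical). */
struct toy_fs {
	bool allocated[NBLOCKS];
	bool staging[NBLOCKS];	/* block is a COW staging extent */
};

/* Toy stand-in for a cached in-memory inode (hypothetical). */
struct toy_inode {
	int cow_block;		/* block recorded in the COW fork */
};

/* Hand out the lowest free block, or NO_BLOCK if the fs is full. */
static int toy_alloc(struct toy_fs *fs)
{
	for (int i = 0; i < NBLOCKS; i++) {
		if (!fs->allocated[i]) {
			fs->allocated[i] = true;
			return i;
		}
	}
	return NO_BLOCK;
}

/* Reserve a staging block and record it in the inode's COW fork. */
static void toy_reserve_cow(struct toy_fs *fs, struct toy_inode *ip)
{
	int b = toy_alloc(fs);

	assert(b != NO_BLOCK);
	fs->staging[b] = true;
	ip->cow_block = b;
}

/*
 * ro remount: meant to cancel the cached COW fork.  lost_race models
 * the remount racing with a reader and skipping this inode.
 */
static void toy_remount_ro(struct toy_fs *fs, struct toy_inode *ip,
			   bool lost_race)
{
	if (lost_race || ip->cow_block == NO_BLOCK)
		return;
	fs->allocated[ip->cow_block] = false;
	fs->staging[ip->cow_block] = false;
	ip->cow_block = NO_BLOCK;
}

/*
 * rw remount: staging extent recovery frees the space but never
 * looks at the COW forks of cached inodes.
 */
static void toy_recover_cow(struct toy_fs *fs)
{
	for (int i = 0; i < NBLOCKS; i++) {
		if (fs->staging[i]) {
			fs->allocated[i] = false;
			fs->staging[i] = false;
		}
	}
}

/*
 * Returns true if, after an ro/rw cycle, the cached COW fork still
 * points at a block that has since been handed to another owner.
 */
bool toy_run(bool lost_race)
{
	struct toy_fs fs = {0};
	struct toy_inode ip = { .cow_block = NO_BLOCK };
	int other;

	toy_reserve_cow(&fs, &ip);
	toy_remount_ro(&fs, &ip, lost_race);
	toy_recover_cow(&fs);
	other = toy_alloc(&fs);	/* a different file allocates space */

	return ip.cow_block != NO_BLOCK && ip.cow_block == other;
}
```

In this model, toy_run(true) returns true: the stale COW fork record
and the new allocation name the same block, which is the cross-linking
that shows up as corruption.  toy_run(false) returns false because the
ro remount cancelled the fork before recovery ran.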

The first patch in this series is the critical fix for the race
condition.  The second patch is defensive in that it moves the COW
staging extent recovery so that it always happens at mount time to
protect us against future screwups.

If you're going to start using this mess, you probably ought to just
pull from my git trees, which are linked below.

This is an extraordinary way to destroy everything.  Enjoy!
Comments and questions are, as always, welcome.

--D

kernel git tree:
https://git.kernel.org/cgit/linux/kernel/git/djwong/xfs-linux.git/log/?h=remount-fixes-5.16
---
 fs/xfs/xfs_mount.c   |   37 ++++++++++++++++++++++++++++---------
 fs/xfs/xfs_reflink.c |    4 +++-
 fs/xfs/xfs_super.c   |   23 +++++++++++------------
 3 files changed, 42 insertions(+), 22 deletions(-)