[PATCHSET V2 for-5.16 0/2] xfs: fix data corruption when cycling ro/rw mounts

"Darrick J. Wong" <djwong@xxxxxxxxxx> · Wed, 08 Dec 2021 15:15:04 -0800

Hi all,

As part of a large customer escalation, I've been combing through the
XFS copy-on-write code to try to find sources of (mostly) silent data
corruption.  I found a nasty problem in the remount code, wherein a ro
remount can race with file reader threads and fail to clean out cached
inode COW forks.  A subsequent rw remount calls the COW staging extent
recovery code, which frees the space but does not update the records in
the cached inode COW forks.  This leads to massive fs corruption.
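
To make the failure sequence concrete, here is a toy userspace model of
it.  This is only a sketch and not kernel code: every toy_* name, the
block-owner array, and the single-record "COW fork" are hypothetical
simplifications, and the final step assumes that the freed space is
handed to some other file by a later allocation.

```c
#include <assert.h>
#include <stdbool.h>

#define NBLOCKS        8
#define NO_BLOCK      (-1)
#define OWNER_FREE     0
#define OWNER_STAGING  99   /* hypothetical owner id for COW staging space */

/* Toy filesystem: one owner id per block. */
struct toy_fs {
    int owner[NBLOCKS];
};

/* Toy cached inode holding a single in-memory COW fork record. */
struct toy_inode {
    int cow_block;          /* NO_BLOCK when the COW fork is empty */
};

/* Hand out the first free block to owner_id, or NO_BLOCK if full. */
static int toy_alloc(struct toy_fs *fs, int owner_id)
{
    for (int b = 0; b < NBLOCKS; b++) {
        if (fs->owner[b] == OWNER_FREE) {
            fs->owner[b] = owner_id;
            return b;
        }
    }
    return NO_BLOCK;
}

/* Reserve a staging block and remember it in the cached COW fork. */
static void toy_cow_reserve(struct toy_fs *fs, struct toy_inode *ip)
{
    int b = toy_alloc(fs, OWNER_STAGING);

    assert(b != NO_BLOCK);
    ip->cow_block = b;
}

/*
 * ro remount: normally frees the staging block and empties the cached
 * COW fork.  If the remount "raced" with a reader, this inode is
 * skipped and its cached record survives.
 */
static void toy_remount_ro(struct toy_fs *fs, struct toy_inode *ip, bool raced)
{
    if (raced || ip->cow_block == NO_BLOCK)
        return;
    fs->owner[ip->cow_block] = OWNER_FREE;
    ip->cow_block = NO_BLOCK;
}

/*
 * rw remount: recovery frees every staging block it finds in the
 * on-disk state, but never looks at cached inodes.
 */
static void toy_recover_cow(struct toy_fs *fs)
{
    for (int b = 0; b < NBLOCKS; b++) {
        if (fs->owner[b] == OWNER_STAGING)
            fs->owner[b] = OWNER_FREE;
    }
}

/*
 * Run reserve -> ro remount -> rw remount -> unrelated allocation.
 * Returns true if the cached COW record ends up pointing at a block
 * that now belongs to another file.
 */
bool toy_scenario(bool raced)
{
    struct toy_fs fs = { { OWNER_FREE } };
    struct toy_inode ip = { .cow_block = NO_BLOCK };
    int other;

    toy_cow_reserve(&fs, &ip);
    toy_remount_ro(&fs, &ip, raced);
    toy_recover_cow(&fs);
    other = toy_alloc(&fs, 2);   /* file 2 allocates after recovery */

    return ip.cow_block != NO_BLOCK && ip.cow_block == other;
}
```

In this model, toy_scenario(true) ends with the stale cached record and
the other file both pointing at the same block, while
toy_scenario(false) leaves the cached COW fork empty.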

The first patch in this series is the critical fix for the race
condition.  The second patch is defensive in that it moves the COW
staging extent recovery so that it always happens at mount time to
protect us against future screwups.

v2: rework comments, move the xfs_reflink_recover_cow call to
    xfs_log_mount_finish

If you're going to start using this mess, you probably ought to just
pull from my git trees, which are linked below.

This is an extraordinary way to destroy everything.  Enjoy!
Comments and questions are, as always, welcome.

--D

kernel git tree:
https://git.kernel.org/cgit/linux/kernel/git/djwong/xfs-linux.git/log/?h=remount-fixes-5.16
---
 fs/xfs/xfs_log.c     |   23 ++++++++++++++++++++++-
 fs/xfs/xfs_mount.c   |   10 ----------
 fs/xfs/xfs_reflink.c |    5 ++++-
 fs/xfs/xfs_super.c   |   23 +++++++++++------------
 4 files changed, 37 insertions(+), 24 deletions(-)