This is a note to let you know that I've just added the patch titled

    xfs: invalidate block device page cache during unmount

to the 6.1-stable tree which can be found at:
    http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary

The filename of the patch is:
     xfs-invalidate-block-device-page-cache-during-unmount.patch
and it can be found in the queue-6.1 subdirectory.

If you, or anyone else, feels it should not be added to the stable tree,
please let <stable@xxxxxxxxxxxxxxx> know about it.


From stable+bounces-42903-greg=kroah.com@xxxxxxxxxxxxxxx Wed May 1 20:41:59 2024
From: Leah Rumancik <leah.rumancik@xxxxxxxxx>
Date: Wed, 1 May 2024 11:41:02 -0700
Subject: xfs: invalidate block device page cache during unmount
To: stable@xxxxxxxxxxxxxxx
Cc: linux-xfs@xxxxxxxxxxxxxxx, amir73il@xxxxxxxxx, chandan.babu@xxxxxxxxxx, fred@xxxxxxxxxxxxxx, "Darrick J. Wong" <djwong@xxxxxxxxxx>, Gao Xiang <hsiangkao@xxxxxxxxxxxxxxxxx>, Dave Chinner <dchinner@xxxxxxxxxx>, Leah Rumancik <leah.rumancik@xxxxxxxxx>
Message-ID: <20240501184112.3799035-14-leah.rumancik@xxxxxxxxx>

From: "Darrick J. Wong" <djwong@xxxxxxxxxx>

[ Upstream commit 032e160305f6872e590c77f11896fb28365c6d6c ]

Every now and then I see fstests failures on aarch64 (64k pages) that
trigger on the following sequence:

mkfs.xfs $dev
mount $dev $mnt
touch $mnt/a
umount $mnt
xfs_db -c 'path /a' -c 'print' $dev

99% of the time this succeeds, but every now and then xfs_db cannot find
/a and fails.  This turns out to be a race involving udev/blkid, the
page cache for the block device, and the xfs_db process.

udev is triggered whenever anyone closes a block device or unmounts it.
The default udev rules invoke blkid to read the fs super and create
symlinks to the bdev under /dev/disk.  For this, it uses buffered reads
through the page cache.

xfs_db also uses buffered reads to examine metadata.  There is no
coordination between xfs_db and udev, which means that they can run
concurrently.
Note there is no coordination between the kernel and blkid either.

On a system with 64k pages, the page cache can cache the superblock and
the root inode (and hence the root dir) with the same 64k page.  If
udev spawns blkid after the mkfs and the system is busy enough that it
is still running when xfs_db starts up, they'll both read from the same
page in the pagecache.

The unmount writes updated inode metadata to disk directly.  The XFS
buffer cache does not use the bdev pagecache, nor does it invalidate the
pagecache on umount.  If the above scenario occurs, the pagecache no
longer reflects what's on disk, xfs_db reads the stale metadata, and
fails to find /a.  Most of the time this succeeds because closing a bdev
invalidates the page cache, but when processes race, everyone loses.

Fix the problem by invalidating the bdev pagecache after flushing the
bdev, so that xfs_db will see up to date metadata.

Signed-off-by: Darrick J. Wong <djwong@xxxxxxxxxx>
Reviewed-by: Gao Xiang <hsiangkao@xxxxxxxxxxxxxxxxx>
Reviewed-by: Dave Chinner <dchinner@xxxxxxxxxx>
Signed-off-by: Leah Rumancik <leah.rumancik@xxxxxxxxx>
Acked-by: Darrick J. Wong <djwong@xxxxxxxxxx>
Signed-off-by: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx>
---
 fs/xfs/xfs_buf.c |    1 +
 1 file changed, 1 insertion(+)

--- a/fs/xfs/xfs_buf.c
+++ b/fs/xfs/xfs_buf.c
@@ -1945,6 +1945,7 @@ xfs_free_buftarg(
 	list_lru_destroy(&btp->bt_lru);
 
 	blkdev_issue_flush(btp->bt_bdev);
+	invalidate_bdev(btp->bt_bdev);
 	fs_put_dax(btp->bt_daxdev, btp->bt_mount);
 
 	kmem_free(btp);


Patches currently in stable-queue which might be from kroah.com@xxxxxxxxxxxxxxx are

queue-6.1/xfs-iomap-move-delalloc-punching-to-iomap.patch
queue-6.1/xfs-fix-off-by-one-block-in-xfs_discard_folio.patch
queue-6.1/xfs-invalidate-block-device-page-cache-during-unmount.patch
queue-6.1/xfs-drop-write-error-injection-is-unfixable-remove-it.patch
queue-6.1/iomap-buffered-write-failure-should-not-truncate-the-page-cache.patch
queue-6.1/xfs-fix-super-block-buf-log-item-uaf-during-force-shutdown.patch
queue-6.1/xfs-fix-incorrect-i_nlink-caused-by-inode-racing.patch
queue-6.1/xfs-estimate-post-merge-refcounts-correctly.patch
queue-6.1/xfs-fix-log-recovery-when-unknown-rocompat-bits-are-set.patch
queue-6.1/xfs-punching-delalloc-extents-on-write-failure-is-racy.patch
queue-6.1/xfs-allow-inode-inactivation-during-a-ro-mount-log-recovery.patch
queue-6.1/iomap-write-iomap-validity-checks.patch
queue-6.1/xfs-attach-dquots-to-inode-before-reading-data-cow-fork-mappings.patch
queue-6.1/xfs-fix-sb-write-verify-for-lazysbcount.patch
queue-6.1/xfs-wait-iclog-complete-before-tearing-down-ail.patch
queue-6.1/xfs-use-byte-ranges-for-write-cleanup-ranges.patch
queue-6.1/xfs-xfs_bmap_punch_delalloc_range-should-take-a-byte-range.patch
queue-6.1/xfs-write-page-faults-in-iomap-are-not-buffered-writes.patch
queue-6.1/xfs-short-circuit-xfs_growfs_data_private-if-delta-is-zero.patch
queue-6.1/xfs-fix-incorrect-error-out-in-xfs_remove.patch
queue-6.1/xfs-invalidate-xfs_bufs-when-allocating-cow-extents.patch
queue-6.1/xfs-hoist-refcount-record-merge-predicates.patch
queue-6.1/xfs-get-root-inode-correctly-at-bulkstat.patch
queue-6.1/xfs-use-iomap_valid-method-to-detect-stale-cached-iomaps.patch
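

Illustration (not part of the patch): the stale-read sequence described in
the commit message can be modelled in userspace C. This is only a rough
sketch under simplifying assumptions. Every toy_* name is hypothetical and
does not exist in the kernel, one small buffer stands in for the single 64k
page that holds both the superblock and the root directory, and the
lingering blkid reader is reduced to "the page is already cached".

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define TOY_PAGE_SIZE 32	/* stands in for one 64k page (sb + root dir) */

/* Hypothetical toy model of a block device and its page cache. */
struct toy_bdev {
	char disk[TOY_PAGE_SIZE];	/* what is actually on the device */
	char cache[TOY_PAGE_SIZE];	/* page cache copy of that page */
	bool cached;			/* true while the cached copy is populated */
};

static void toy_init(struct toy_bdev *b, const char *contents)
{
	assert(strlen(contents) < TOY_PAGE_SIZE);
	memset(b, 0, sizeof(*b));
	strcpy(b->disk, contents);
}

/* Buffered read (blkid, xfs_db): fill the cache on a miss, then serve from it. */
static const char *toy_buffered_read(struct toy_bdev *b)
{
	if (!b->cached) {
		memcpy(b->cache, b->disk, TOY_PAGE_SIZE);
		b->cached = true;
	}
	return b->cache;
}

/* Metadata writeback: goes straight to disk and bypasses the page cache. */
static void toy_direct_write(struct toy_bdev *b, const char *contents)
{
	assert(strlen(contents) < TOY_PAGE_SIZE);
	memset(b->disk, 0, TOY_PAGE_SIZE);
	strcpy(b->disk, contents);
}

/* Drop the cached copy so the next buffered read goes back to disk. */
static void toy_invalidate(struct toy_bdev *b)
{
	b->cached = false;
}

/* Replays the sequence; returns true if the final reader finds "a". */
static bool toy_reader_sees_update(bool invalidate_on_unmount)
{
	struct toy_bdev b;

	toy_init(&b, "rootdir:");		/* mkfs */
	(void)toy_buffered_read(&b);		/* blkid populates the page cache */
	toy_direct_write(&b, "rootdir: a");	/* touch + unmount write metadata */
	if (invalidate_on_unmount)
		toy_invalidate(&b);		/* the step the patch adds */
	return strstr(toy_buffered_read(&b), " a") != NULL;	/* xfs_db */
}
```

In this model, toy_reader_sees_update(false) replays the failing case: the
second buffered read is served from the copy cached before the write.
toy_reader_sees_update(true) adds the invalidation step, so the reader goes
back to the on-disk contents.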