This is a note to let you know that I've just added the patch titled

    xfs: check for race with xfs_reclaim_inode() in xfs_ifree_cluster()

to the 4.13-stable tree which can be found at:
    http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary

The filename of the patch is:
     xfs-check-for-race-with-xfs_reclaim_inode-in-xfs_ifree_cluster.patch
and it can be found in the queue-4.13 subdirectory.

If you, or anyone else, feels it should not be added to the stable tree,
please let <stable@xxxxxxxxxxxxxxx> know about it.


From foo@baz Mon Sep 18 10:25:08 CEST 2017
From: Christoph Hellwig <hch@xxxxxx>
Date: Sun, 17 Sep 2017 14:06:17 -0700
Subject: xfs: check for race with xfs_reclaim_inode() in xfs_ifree_cluster()
To: stable@xxxxxxxxxxxxxxx
Cc: linux-xfs@xxxxxxxxxxxxxxx, Omar Sandoval <osandov@xxxxxx>, "Darrick J . Wong" <darrick.wong@xxxxxxxxxx>
Message-ID: <20170917210631.10725-12-hch@xxxxxx>

From: Omar Sandoval <osandov@xxxxxx>

commit f2e9ad212def50bcf4c098c6288779dd97fff0f0 upstream.

After xfs_ifree_cluster() finds an inode in the radix tree and verifies
that the inode number is what it expected, xfs_reclaim_inode() can swoop
in and free it. xfs_ifree_cluster() will then happily continue working
on the freed inode. Most importantly, it will mark the inode stale,
which will probably be overwritten when the inode slab object is
reallocated, but if it has already been reallocated then we can end up
with an inode spuriously marked stale.

In 8a17d7ddedb4 ("xfs: mark reclaimed inodes invalid earlier") we added
a second check to xfs_iflush_cluster() to detect this race, but the
similar RCU lookup in xfs_ifree_cluster() needs the same treatment.

Signed-off-by: Omar Sandoval <osandov@xxxxxx>
Reviewed-by: Brian Foster <bfoster@xxxxxxxxxx>
Reviewed-by: Darrick J. Wong <darrick.wong@xxxxxxxxxx>
Signed-off-by: Darrick J. Wong <darrick.wong@xxxxxxxxxx>
Signed-off-by: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx>
---
 fs/xfs/xfs_icache.c |   10 +++++-----
 fs/xfs/xfs_inode.c  |   23 ++++++++++++++++++-----
 2 files changed, 23 insertions(+), 10 deletions(-)

--- a/fs/xfs/xfs_icache.c
+++ b/fs/xfs/xfs_icache.c
@@ -1124,11 +1124,11 @@ reclaim:
 	 * Because we use RCU freeing we need to ensure the inode always appears
 	 * to be reclaimed with an invalid inode number when in the free state.
 	 * We do this as early as possible under the ILOCK so that
-	 * xfs_iflush_cluster() can be guaranteed to detect races with us here.
-	 * By doing this, we guarantee that once xfs_iflush_cluster has locked
-	 * XFS_ILOCK that it will see either a valid, flushable inode that will
-	 * serialise correctly, or it will see a clean (and invalid) inode that
-	 * it can skip.
+	 * xfs_iflush_cluster() and xfs_ifree_cluster() can be guaranteed to
+	 * detect races with us here. By doing this, we guarantee that once
+	 * xfs_iflush_cluster() or xfs_ifree_cluster() has locked XFS_ILOCK that
+	 * it will see either a valid inode that will serialise correctly, or it
+	 * will see an invalid inode that it can skip.
 	 */
 	spin_lock(&ip->i_flags_lock);
 	ip->i_flags = XFS_IRECLAIM;
--- a/fs/xfs/xfs_inode.c
+++ b/fs/xfs/xfs_inode.c
@@ -2359,11 +2359,24 @@ retry:
 			 * already marked stale. If we can't lock it, back off
 			 * and retry.
 			 */
-			if (ip != free_ip &&
-			    !xfs_ilock_nowait(ip, XFS_ILOCK_EXCL)) {
-				rcu_read_unlock();
-				delay(1);
-				goto retry;
+			if (ip != free_ip) {
+				if (!xfs_ilock_nowait(ip, XFS_ILOCK_EXCL)) {
+					rcu_read_unlock();
+					delay(1);
+					goto retry;
+				}
+
+				/*
+				 * Check the inode number again in case we're
+				 * racing with freeing in xfs_reclaim_inode().
+				 * See the comments in that function for more
+				 * information as to why the initial check is
+				 * not sufficient.
+				 */
+				if (ip->i_ino != inum + i) {
+					xfs_iunlock(ip, XFS_ILOCK_EXCL);
+					continue;
+				}
 			}
 			rcu_read_unlock();


Patches currently in stable-queue which might be from hch@xxxxxx are

queue-4.13/xfs-open-code-xfs_buf_item_dirty.patch
queue-4.13/xfs-properly-retry-failed-inode-items-in-case-of-error-during-buffer-writeback.patch
queue-4.13/xfs-use-kmem_free-to-free-return-value-of-kmem_zalloc.patch
queue-4.13/xfs-add-infrastructure-needed-for-error-propagation-during-buffer-io-failure.patch
queue-4.13/xfs-don-t-set-v3-xflags-for-v2-inodes.patch
queue-4.13/xfs-toggle-readonly-state-around-xfs_log_mount_finish.patch
queue-4.13/xfs-fix-log-recovery-corruption-error-due-to-tail-overwrite.patch
queue-4.13/xfs-move-bmbt-owner-change-to-last-step-of-extent-swap.patch
queue-4.13/xfs-check-for-race-with-xfs_reclaim_inode-in-xfs_ifree_cluster.patch
queue-4.13/xfs-always-verify-the-log-tail-during-recovery.patch
queue-4.13/xfs-open-code-end_buffer_async_write-in-xfs_finish_page_writeback.patch
queue-4.13/xfs-relog-dirty-buffers-during-swapext-bmbt-owner-change.patch
queue-4.13/xfs-disable-per-inode-dax-flag.patch
queue-4.13/xfs-refactor-buffer-logging-into-buffer-dirtying-helper.patch
queue-4.13/xfs-fix-recovery-failure-when-log-record-header-wraps-log-end.patch
queue-4.13/xfs-skip-bmbt-block-ino-validation-during-owner-change.patch
queue-4.13/xfs-don-t-log-dirty-ranges-for-ordered-buffers.patch
queue-4.13/xfs-stop-searching-for-free-slots-in-an-inode-chunk-when-there-are-none.patch
queue-4.13/xfs-fix-incorrect-log_flushed-on-fsync.patch
queue-4.13/xfs-evict-all-inodes-involved-with-log-redo-item.patch
queue-4.13/xfs-write-unmount-record-for-ro-mounts.patch
queue-4.13/xfs-remove-unnecessary-dirty-bli-format-check-for-ordered-bufs.patch
queue-4.13/xfs-disallow-marking-previously-dirty-buffers-as-ordered.patch
queue-4.13/xfs-handle-efscorrupted-during-head-tail-verification.patch
queue-4.13/xfs-ordered-buffer-log-items-are-never-formatted.patch
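
For readers who want the shape of the fix outside the kernel, here is a
minimal single-threaded userspace sketch of the "check, lock, check
again" pattern that the xfs_inode.c hunk adds. It is not kernel code.
Every name in it (toy_inode, toy_trylock, toy_unlock, toy_reclaim,
toy_mark_stale, toy_race_hook) is hypothetical, the boolean "locked"
field merely stands in for XFS_ILOCK_EXCL, and the sketch assumes that
the reclaimer invalidates an inode by setting its number to 0. The race
window is simulated with a callback rather than a real second thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an inode; only the fields the sketch needs. */
struct toy_inode {
	uint64_t ino;	/* assumption: 0 means "invalid, being reclaimed" */
	bool locked;	/* stand-in for XFS_ILOCK_EXCL */
	bool stale;
};

/* Callback used to simulate another thread running in the race window. */
typedef void (*toy_race_hook)(struct toy_inode *ip);

static bool toy_trylock(struct toy_inode *ip)
{
	if (ip->locked)
		return false;
	ip->locked = true;
	return true;
}

static void toy_unlock(struct toy_inode *ip)
{
	assert(ip->locked);
	ip->locked = false;
}

/* Simulated reclaimer: invalidates the inode number (assumed behaviour). */
static void toy_reclaim(struct toy_inode *ip)
{
	ip->ino = 0;
}

/*
 * Mark the inode stale only if it still carries the expected number once
 * the lock is held. Returns true if the inode was marked stale.
 */
static bool toy_mark_stale(struct toy_inode *ip, uint64_t expected,
			   toy_race_hook hook)
{
	/* First, unlocked check: passes for a live inode. */
	if (ip->ino != expected)
		return false;

	/* Race window: the reclaimer may run here, before we take the lock. */
	if (hook != NULL)
		hook(ip);

	/* The patched kernel code backs off and retries; the sketch gives up. */
	if (!toy_trylock(ip))
		return false;

	/* Second check under the lock: this is the step the patch adds. */
	if (ip->ino != expected) {
		toy_unlock(ip);
		return false;
	}

	ip->stale = true;
	toy_unlock(ip);
	return true;
}
```

Calling toy_mark_stale() with a NULL hook marks the inode stale. Passing
toy_reclaim as the hook makes the second check fail, so the inode is
unlocked and skipped instead of being spuriously marked stale, which is
the outcome the commit message describes.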