Hi Darrick,

Can you please pull the new lockless buffer cache lookup code from
the tag below? This contains all the little remaining cleanups (rvbs,
typos, etc) I've made since the last version posted to the list. It
merges cleanly against a current for-next tree, with or without the
iunlink item branch merged into it.

Cheers,

Dave.

------

The following changes since commit 7561cea5dbb97fecb952548a0fb74fb105bf4664:

  xfs: prevent a UAF when log IO errors race with unmount (2022-07-01 09:09:52 -0700)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/dgc/linux-xfs tags/xfs-buf-lockless-lookup-5.20

for you to fetch changes up to 298f342245066309189d8637ca7339d56840c3e1:

  xfs: lockless buffer lookup (2022-07-14 12:05:07 +1000)

----------------------------------------------------------------
xfs: lockless buffer cache lookups

Current work to merge the XFS inode life cycle with the VFS inode
life cycle is finding some interesting issues. If we have a path
that hits buffer trylocks fairly hard (e.g. a non-blocking
background inode freeing function), we end up hitting massive
contention on the buffer cache hash locks:

-   92.71%     0.05%  [kernel]                  [k] xfs_inodegc_worker
   - 92.67% xfs_inodegc_worker
      - 92.13% xfs_inode_unlink
         - 91.52% xfs_inactive_ifree
            - 85.63% xfs_read_agi
               - 85.61% xfs_trans_read_buf_map
                  - 85.59% xfs_buf_read_map
                     - xfs_buf_get_map
                        - 85.55% xfs_buf_find
                           - 72.87% _raw_spin_lock
                              - do_raw_spin_lock
                                   71.86% __pv_queued_spin_lock_slowpath
                           - 8.74% xfs_buf_rele
                              - 7.88% _raw_spin_lock
                                 - 7.88% do_raw_spin_lock
                                      7.63% __pv_queued_spin_lock_slowpath
                           - 1.70% xfs_buf_trylock
                              - 1.68% down_trylock
                                 - 1.41% _raw_spin_lock_irqsave
                                    - 1.39% do_raw_spin_lock
                                         __pv_queued_spin_lock_slowpath
                           - 0.76% _raw_spin_unlock
                                0.75% do_raw_spin_unlock

This is basically hammering the pag->pag_buf_lock from lots of CPUs
doing trylocks at the same time. Most of the buffer trylock
operations ultimately fail after we've done the lookup, so we're
really hammering the buf hash lock whilst making no progress.

We can also see significant spinlock traffic on the same lock just
under normal operation when lots of tasks are accessing metadata
from the same AG, so let's avoid all this by creating a lookup fast
path which leverages the rhashtable's ability to do RCU protected
lookups.

Signed-off-by: Dave Chinner <dchinner@xxxxxxxxxx>

----------------------------------------------------------------
Dave Chinner (6):
      xfs: rework xfs_buf_incore() API
      xfs: break up xfs_buf_find() into individual pieces
      xfs: merge xfs_buf_find() and xfs_buf_get_map()
      xfs: reduce the number of atomic when locking a buffer after lookup
      xfs: remove a superfluous hash lookup when inserting new buffers
      xfs: lockless buffer lookup

 fs/xfs/libxfs/xfs_attr_remote.c |  15 ++++--
 fs/xfs/scrub/repair.c           |  15 +++--
 fs/xfs/xfs_buf.c                | 263 +++++++++++++++++++++++++++++++++++++++++++++++++++++------------------------------------------
 fs/xfs/xfs_buf.h                |  21 ++++--
 fs/xfs/xfs_qm.c                 |   9 ++--
 5 files changed, 188 insertions(+), 135 deletions(-)

-- 
Dave Chinner
david@xxxxxxxxxxxxx
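
For illustration, here is a minimal sketch of the kind of RCU-protected
rhashtable lookup fast path the tag message above describes. This is not
the actual xfs_buf.c code from the series; the structure and names
(my_buf, my_buf_lookup_fast, my_buf_hash_params) are hypothetical, and
it assumes objects are only reclaimed after an RCU grace period (e.g.
via kfree_rcu()) so a lookup racing with teardown never touches freed
memory.

#include <linux/rhashtable.h>
#include <linux/rcupdate.h>
#include <linux/atomic.h>

/* Hypothetical cached object; stands in for a cached metadata buffer. */
struct my_buf {
	struct rhash_head	b_hash;		/* rhashtable linkage */
	u64			b_blkno;	/* lookup key */
	atomic_t		b_hold;		/* reference count */
	struct rcu_head		b_rcu;		/* for kfree_rcu() on teardown */
};

static const struct rhashtable_params my_buf_hash_params = {
	.key_len	= sizeof(u64),
	.key_offset	= offsetof(struct my_buf, b_blkno),
	.head_offset	= offsetof(struct my_buf, b_hash),
};

/*
 * Lockless lookup fast path: walk the hash table under RCU instead of
 * taking the hash spinlock. A reference is only granted via
 * atomic_inc_not_zero(), so an object whose hold count has already
 * dropped to zero (i.e. is being torn down) is treated as a cache miss
 * and the caller falls back to the locked slow path.
 */
static struct my_buf *
my_buf_lookup_fast(struct rhashtable *cache, u64 blkno)
{
	struct my_buf	*bp;

	rcu_read_lock();
	bp = rhashtable_lookup(cache, &blkno, my_buf_hash_params);
	if (bp && !atomic_inc_not_zero(&bp->b_hold))
		bp = NULL;		/* racing with free: treat as miss */
	rcu_read_unlock();

	return bp;
}

The design point this sketch is meant to show is that only the lookup
is lockless: a hit takes its reference with atomic_inc_not_zero() so
nothing is pinned that is already on its way out, while misses (and
objects caught mid-teardown) drop back to a slow path that still
serialises insertion under the hash lock.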