On 5/19/20 3:23 PM, Dave Chinner wrote:
> From: Dave Chinner <dchinner@xxxxxxxxxx>
>
> Seeing massive cpu usage from xfs_agino_range() on one machine;
> instruction level profiles look similar to another machine running
> the same workload, only one machien is consuming 10x as much CPU as

's/machien/machine/', can be done at the time of applying patch.

> the other and going much slower. The only real difference between
> the two machines is core count per socket. Both are running
> identical 16p/16GB virtual machine configurations
>
> Machine A:
>
>      25.83%  [k] xfs_agino_range
>      12.68%  [k] __xfs_dir3_data_check
>       6.95%  [k] xfs_verify_ino
>       6.78%  [k] xfs_dir2_data_entry_tag_p
>       3.56%  [k] xfs_buf_find
>       2.31%  [k] xfs_verify_dir_ino
>       2.02%  [k] xfs_dabuf_map.constprop.0
>       1.65%  [k] xfs_ag_block_count
>
> And takes around 13 minutes to remove 50 million inodes.
>
> Machine B:
>
>      13.90%  [k] __pv_queued_spin_lock_slowpath
>       3.76%  [k] do_raw_spin_lock
>       2.83%  [k] xfs_dir3_leaf_check_int
>       2.75%  [k] xfs_agino_range
>       2.51%  [k] __raw_callee_save___pv_queued_spin_unlock
>       2.18%  [k] __xfs_dir3_data_check
>       2.02%  [k] xfs_log_commit_cil
>
> And takes around 5m30s to remove 50 million inodes.
>
> Suspect is cacheline contention on m_sectbb_log which is used in one
> of the macros in xfs_agino_range. This is a read-only variable but
> shares a cacheline with m_active_trans which is a global atomic that
> gets bounced all around the machine.
>
> The workload is trying to run hundreds of thousands of transactions
> per second and hence cacheline contention will be occuring on this

's/occuring/occurring/', can be done at the time of applying patch.

> atomic counter. Hence xfs_agino_range() is likely just be an
> innocent bystander as the cache coherency protocol fights over the
> cacheline between CPU cores and sockets.
>
> On machine A, this rearrangement of the struct xfs_mount
> results in the profile changing to:
>
>       9.77%  [kernel]  [k] xfs_agino_range
>       6.27%  [kernel]  [k] __xfs_dir3_data_check
>       5.31%  [kernel]  [k] __pv_queued_spin_lock_slowpath
>       4.54%  [kernel]  [k] xfs_buf_find
>       3.79%  [kernel]  [k] do_raw_spin_lock
>       3.39%  [kernel]  [k] xfs_verify_ino
>       2.73%  [kernel]  [k] __raw_callee_save___pv_queued_spin_unlock
>
> Vastly less CPU usage in xfs_agino_range(), but still 3x the amount
> of machine B and still runs substantially slower than it should.
>
> Current rm -rf of 50 million files:
>
>                 vanilla         patched
> machine A       13m20s          6m42s
> machine B        5m30s          5m02s
>
> It's an improvement, hence indicating that separation and further
> optimisation of read-only global filesystem data is worthwhile, but
> it clearly isn't the underlying issue causing this specific
> performance degradation.
>
> Signed-off-by: Dave Chinner <dchinner@xxxxxxxxxx>
> ---
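
For reference, the false-sharing pattern the commit message describes can be
sketched in plain C. The field names below are illustrative only, not the
actual struct xfs_mount layout, and a kernel patch would express the
separation with ____cacheline_aligned_in_smp rather than C11 alignas; this is
just a minimal userspace sketch of why the struct rearrangement helps:

    /*
     * Sketch of the layout problem: a read-mostly geometry field ends
     * up on the same cacheline as a hot atomic, so every transaction
     * start/commit invalidates the line that readers of the geometry
     * field depend on.
     */
    #include <stdalign.h>
    #include <stdatomic.h>
    #include <stdint.h>

    struct mount_before {
            uint8_t     sectbb_log;   /* read-only after mount */
            atomic_long active_trans; /* bumped on every transaction */
    };

    /*
     * Rearranged: the frequently written counter is pushed onto its
     * own cacheline, so reads of sectbb_log are no longer hit by
     * invalidations caused by active_trans bouncing between cores
     * and sockets.
     */
    struct mount_after {
            uint8_t     sectbb_log;   /* read-only after mount */
            /* ... other read-mostly geometry fields ... */
            alignas(64) atomic_long active_trans;
    };

    _Static_assert(alignof(struct mount_after) >= 64,
                   "hot counter gets its own cacheline");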