Re: [PATCH 22/32] vfs: inode cache conversion to hash-bl

On Tue, May 16, 2023 at 12:17:04PM -0400, Kent Overstreet wrote:
> On Tue, May 16, 2023 at 05:45:19PM +0200, Christian Brauner wrote:
> > On Wed, May 10, 2023 at 02:45:57PM +1000, Dave Chinner wrote:
> > There's a bit of a backlog before I get around to looking at this but
> > it'd be great if we'd have a few reviewers for this change.
> 
> It is well tested - it's been in the bcachefs tree for ages with zero
> issues. I'm pulling it out of the bcachefs-prerequisites series though
> since Dave's still got it in his tree, he's got a newer version with
> better commit messages.
> 
> It's a significant performance boost on metadata heavy workloads for any
> non-XFS filesystem, we should definitely get it in.

I've got an up-to-date vfs-scale tree here (6.4-rc1) but I have not
been able to test it effectively right now because my local
performance test server is broken. I'll do what I can on the old
small machine that I have to validate it when I get time, but that
might be a few weeks away....

git://git.kernel.org/pub/scm/linux/kernel/git/dgc/linux-xfs.git vfs-scale

As it is, the inode hash-bl changes have zero impact on XFS because
it has its own highly scalable, lockless, sharded inode cache. So
unless I'm explicitly testing ext4 or btrfs scalability (rare) it's
not getting a lot of scalability exercise. It is being used by the
root filesystems on all those test VMs, but that's about it...
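For anyone not familiar with the hash-bl approach being discussed:
the idea is to replace the single global inode_hash_lock with a
per-bucket bit spinlock packed into bit 0 of each hash chain's head
pointer, so operations on different buckets never contend. A minimal
userspace sketch of that locking scheme, using C11 atomics (the names
bl_bucket, bl_lock etc. are illustrative, not the kernel's hlist_bl
API):

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>
#include <stdlib.h>

struct bl_node { struct bl_node *next; unsigned long key; };

/* Bucket head: bit 0 is the lock, the remaining bits hold the
 * first-node pointer. One word per bucket, no separate lock array. */
struct bl_bucket { _Atomic uintptr_t head; };

#define BL_LOCK_BIT 1UL

static void bl_lock(struct bl_bucket *b)
{
	uintptr_t old;
	for (;;) {
		/* expect the bucket unlocked; spin until we win the bit */
		old = atomic_load(&b->head) & ~BL_LOCK_BIT;
		if (atomic_compare_exchange_weak(&b->head, &old,
						 old | BL_LOCK_BIT))
			return;
	}
}

static void bl_unlock(struct bl_bucket *b)
{
	atomic_fetch_and(&b->head, ~BL_LOCK_BIT);
}

static void bl_insert(struct bl_bucket *b, struct bl_node *n)
{
	bl_lock(b);
	n->next = (struct bl_node *)(atomic_load(&b->head) & ~BL_LOCK_BIT);
	/* store the new head while preserving the (held) lock bit */
	atomic_store(&b->head, (uintptr_t)n | BL_LOCK_BIT);
	bl_unlock(b);
}

static struct bl_node *bl_find(struct bl_bucket *b, unsigned long key)
{
	struct bl_node *n;

	bl_lock(b);
	for (n = (struct bl_node *)(atomic_load(&b->head) & ~BL_LOCK_BIT);
	     n; n = n->next)
		if (n->key == key)
			break;
	bl_unlock(b);
	return n;
}
```

With a reasonably sized table, two lookups in different buckets take
two different bit locks, which is where the scalability win for
ext4/btrfs comes from.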

That said, my vfs-scale tree also has Waiman Long's old dlist code
(per-CPU linked lists), which converts the sb inode list and removes
the global lock there. This does make a huge impact for XFS - the
current code limits inode cache cycling to about 600,000 inodes/sec
on >=16p machines. With dlists, however:

| 5.17.0 on a XFS filesystem with 50 million inodes in it on a 32p
| machine with a 1.6MIOPS/6.5GB/s block device.
| 
| Fully concurrent full filesystem bulkstat:
| 
| 		wall time	sys time	IOPS	BW	rate
| unpatched:	1m56.035s	56m12.234s	 8k     200MB/s	0.4M/s
| patched:	0m15.710s	 3m45.164s	70k	1.9GB/s 3.4M/s
| 
| Unpatched flat kernel profile:
| 
|   81.97%  [kernel]  [k] __pv_queued_spin_lock_slowpath
|    1.84%  [kernel]  [k] do_raw_spin_lock
|    1.33%  [kernel]  [k] __raw_callee_save___pv_queued_spin_unlock
|    0.50%  [kernel]  [k] memset_erms
|    0.42%  [kernel]  [k] do_raw_spin_unlock
|    0.42%  [kernel]  [k] xfs_perag_get
|    0.40%  [kernel]  [k] xfs_buf_find
|    0.39%  [kernel]  [k] __raw_spin_lock_init
| 
| Patched flat kernel profile:
| 
|   10.90%  [kernel]  [k] do_raw_spin_lock
|    7.21%  [kernel]  [k] __raw_callee_save___pv_queued_spin_unlock
|    3.16%  [kernel]  [k] xfs_buf_find
|    3.06%  [kernel]  [k] rcu_segcblist_enqueue
|    2.73%  [kernel]  [k] memset_erms
|    2.31%  [kernel]  [k] __pv_queued_spin_lock_slowpath
|    2.15%  [kernel]  [k] __raw_spin_lock_init
|    2.15%  [kernel]  [k] do_raw_spin_unlock
|    2.12%  [kernel]  [k] xfs_perag_get
|    1.93%  [kernel]  [k] xfs_btree_lookup
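The structure of the dlist conversion mentioned above is: split the
single sb->s_inode_list_lock plus list into one list and lock per
CPU, so additions and removals only ever touch the local CPU's list,
and the (much rarer) full traversal walks each per-CPU list in turn.
A hypothetical userspace sketch of that shape (NCPU, pcpu_add etc.
are illustrative names, not Waiman's actual API):

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define NCPU 4

struct pnode { struct pnode *next, *prev; };

struct pcpu_list {
	pthread_mutex_t lock;
	struct pnode head;		/* circular sentinel */
};

static struct pcpu_list lists[NCPU];

static void pcpu_init(void)
{
	for (int c = 0; c < NCPU; c++) {
		pthread_mutex_init(&lists[c].lock, NULL);
		lists[c].head.next = lists[c].head.prev = &lists[c].head;
	}
}

/* Add to the "current CPU" list - only that CPU's lock is taken,
 * so concurrent adds on different CPUs never contend. */
static void pcpu_add(int cpu, struct pnode *n)
{
	struct pcpu_list *l = &lists[cpu % NCPU];

	pthread_mutex_lock(&l->lock);
	n->next = l->head.next;
	n->prev = &l->head;
	l->head.next->prev = n;
	l->head.next = n;
	pthread_mutex_unlock(&l->lock);
}

/* Full traversal: walk every per-CPU list, locking one at a time. */
static int pcpu_count(void)
{
	int total = 0;

	for (int c = 0; c < NCPU; c++) {
		pthread_mutex_lock(&lists[c].lock);
		for (struct pnode *n = lists[c].head.next;
		     n != &lists[c].head; n = n->next)
			total++;
		pthread_mutex_unlock(&lists[c].lock);
	}
	return total;
}
```

The profile numbers above show why this matters: with one global
lock, __pv_queued_spin_lock_slowpath dominates at 82% of CPU time;
sharding the list per CPU removes that single point of contention.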

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx


