Re: [GIT PULL] bcachefs changes for 6.12-rc1

On Tue, Sep 24, 2024 at 01:34:14PM GMT, Dave Chinner wrote:
> On Mon, Sep 23, 2024 at 10:55:57PM -0400, Kent Overstreet wrote:
> > But stat/statx always pulls into the vfs inode cache, and that's likely
> > worth fixing.
> 
> No, let's not even consider going there.
> 
> Unlike most people, old time XFS developers have direct experience
> with the problems that "uncached" inode access for stat purposes causes.
> 
> XFS has had the bulkstat API for a long, long time (i.e. since 1998
> on Irix). When it was first implemented on Irix, it was VFS cache
> coherent. But in the early 2000s, that caused problems with HSMs
> needing to scan billions of inodes indexing petabytes of stored data
> with certain SLA guarantees (i.e. needing to scan at least a million
> inodes a second).  The CPU overhead of cache instantiation and
> teardown was too great to meet those performance targets on 500MHz
> MIPS CPUs.
> 
> So we converted bulkstat to run directly out of the XFS buffer cache
> (i.e. uncached from the perspective of the VFS). This reduced the
> CPU overhead per inode substantially, allowing bulkstat rates to
> increase by a factor of 10. However, it introduced all sorts of
> coherency problems between cached inode state vs what was stored in
> the buffer cache. It was basically O_DIRECT for stat() and, as you'd
> expect from that description, the coherency problems were horrible.
> Detecting iallocated-but-not-yet-updated and
> unlinked-but-not-yet-freed inodes were particularly consistent
> sources of issues.
> 
> The only way to fix these coherency problems was to check the inode
> cache for a resident inode first, which basically defeated the
> entire purpose of bypassing the VFS cache in the first place.

Eh? Of course it'd have to be coherent, but just checking if an inode is
present in the VFS cache is what, 1-2 cache misses? Depending on hash
table fill factor...

That's going to show up, but I have a hard time seeing that as "defeating
the entire purpose" of bypassing the VFS cache, as you say.
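
To make the cost being argued about concrete, here is a toy user-space
sketch of a "check the inode cache first, otherwise read the on-disk copy"
stat path. It is not kernel code and not how XFS or bcachefs implement it:
every name in it (cached_inode, disk_inode, cache_lookup, stat_size, the
64-bucket table, the hash multiplier) is a hypothetical stand-in chosen for
illustration. The only point is that the coherency check is one hash
computation plus one chain walk per inode.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CACHE_BUCKETS 64 /* hypothetical table size: 2^6 buckets */

/* Toy stand-in for an in-memory (VFS-cached) inode. */
struct cached_inode {
    uint64_t ino;
    uint64_t size;             /* may be newer than the on-disk copy */
    struct cached_inode *next; /* hash chain link */
};

/* Toy stand-in for the on-disk inode record (the "uncached" copy). */
struct disk_inode {
    uint64_t ino;
    uint64_t size;
};

enum stat_source { STAT_NOT_FOUND, STAT_FROM_CACHE, STAT_FROM_DISK };

static struct cached_inode *cache[CACHE_BUCKETS];

/* Multiplicative hash; the top 6 bits select one of the 64 buckets. */
static size_t bucket_of(uint64_t ino)
{
    return (size_t)((ino * 0x9E3779B97F4A7C15ull) >> 58);
}

/* Push an inode onto the front of its hash chain. */
static void cache_insert(struct cached_inode *ci)
{
    assert(ci != NULL);
    size_t b = bucket_of(ci->ino);
    ci->next = cache[b];
    cache[b] = ci;
}

/* The coherency check: one bucket load plus a chain walk. */
static struct cached_inode *cache_lookup(uint64_t ino)
{
    for (struct cached_inode *ci = cache[bucket_of(ino)]; ci; ci = ci->next)
        if (ci->ino == ino)
            return ci;
    return NULL;
}

/*
 * Report an inode's size. A resident cached inode always wins, since it
 * may hold state not yet written back; otherwise fall back to the
 * on-disk records without instantiating anything in the cache.
 */
static enum stat_source stat_size(uint64_t ino,
                                  const struct disk_inode *disk, size_t ndisk,
                                  uint64_t *size_out)
{
    struct cached_inode *ci = cache_lookup(ino);
    if (ci) {
        *size_out = ci->size;
        return STAT_FROM_CACHE;
    }
    for (size_t i = 0; i < ndisk; i++) {
        if (disk[i].ino == ino) {
            *size_out = disk[i].size;
            return STAT_FROM_DISK;
        }
    }
    return STAT_NOT_FOUND;
}
```

With a lightly loaded table the chain walk is short, which is where the
"depending on hash table fill factor" caveat comes from. The sketch ignores
locking and the allocated-but-not-yet-updated and unlinked-but-not-yet-freed
cases described above.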

> Don't hack around VFS scalability issues if it can be avoided.

Well, maybe if your dlock list patches make it in - I still see crazy
lock contention there...



