Re: [RFC PATCH 0/7] vfs: improving inode cache iteration scalability

On Thu, Oct 03, 2024 at 01:45:55PM +0200, Jan Kara wrote:
> Hi Dave!
> 
> On Wed 02-10-24 11:33:17, Dave Chinner wrote:
> > There are two superblock iterator functions provided. The first is a
> > generic iterator that provides safe, reference counted inodes for
> > the callback to operate on. This is generally what most sb->s_inodes
> > iterators use, and it allows the iterator to drop locks and perform
> > blocking operations on the inode before moving to the next inode in
> > the sb->s_inodes list.
> > 
> > There is one quirk to this interface - INO_ITER_REFERENCE - because
> > fsnotify iterates the inode cache -after- evict_inodes() has been
> > called during superblock shutdown to evict all non-referenced
> > inodes. Hence it should only find referenced inodes, and it has
> > a check to skip unreferenced inodes. This flag does the same.
> 
> Overall I really like the series. A lot of duplicated code removed and
> scalability improved, we don't get such deals frequently :) Regarding
> INO_ITER_REFERENCE I think that after commit 1edc8eb2e9313 ("fs: call
> fsnotify_sb_delete after evict_inodes") the check for 0 i_count in
> fsnotify_unmount_inodes() isn't that useful anymore so I'd be actually fine
> dropping it (as a separate patch please).
> 
> That being said I'd like to discuss one thing: As you have surely noticed,
> some of the places iterating inodes perform additional checks on the inode
> to determine whether the inode is interesting or not (e.g. the Landlock
> iterator or iterators in quota code) to avoid the unnecessary iget / iput
> and locking dance.

Yes, but we really don't care. None of these cases are performance
critical, and I'd much prefer that we have consistent behaviour.

> The inode refcount check you've worked around with
> INO_ITER_REFERENCE is a special case of that. Have you considered the option
> of providing a callback for the check inside the iterator?

I did. I decided that it wasn't necessary just to avoid the
occasional iget/iput. It's certainly not necessary for the
fsnotify/landlock cases where INO_ITER_REFERENCE was used because
at that point there are only landlock and fsnotify inodes left in
the cache. We're going to be doing iget/iput on all of them
anyway.

Really, subsystems should be tracking inodes they have references to
themselves, not running 'needle in haystack' searches for inodes
they hold references to. That would get rid of both the fsnotify and
landlock iterators completely...

> Also maybe I went a *bit* overboard here with macro magic but the code
> below should provide an iterator that you can use like:
> 
> 	for_each_sb_inode(sb, inode, inode_eligible_check(inode)) {
> 		do my stuff here
> 	}

As I explained to Kent: wrapping the existing code in a different
iterator defeats the entire purpose of the change to the iteration
code.

> that will avoid any indirect calls and will magically handle all the
> cleanup that needs to be done if you break / jump out of the loop or
> similar. I actually find such constructs more convenient to use than your
> version of the iterator because there's no need to create & pass around the
> additional data structure for the iterator body, no need for special return
> values to abort iteration etc.

I'm not introducing the callback-based iterator function to clean
the code up - I'm introducing it as infrastructure that allows the
*iteration mechanism to be completely replaced* by filesystems that
have more efficient, more scalable inode iterators already built
in.

This change of iterator model also allows seamless transition of
individual filesystems to new iterator mechanisms. Macro based
iterators do not allow for different iterator implementations to
co-exist, but that's exactly what I'm trying to achieve here.
I'm not trying to clean the code up - I'm trying to lay the
ground-work for new functionality....
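
To make the distinction concrete, here is a minimal userspace sketch of
the idea, not code from the series. Every name in it (toy_inode, toy_sb,
the iter_inodes hook, TOY_ITER_ABORT) is hypothetical and only stands in
for the real VFS structures. The point it tries to show is that the
caller supplies a callback and never sees how the walk is done, so a
filesystem can substitute its own walk without touching any caller:

```c
#include <stddef.h>

/* Hypothetical stand-ins for struct inode and struct super_block. */
struct toy_inode {
	unsigned long ino;
	int refcount;
	struct toy_inode *next;		/* link in the generic per-sb list */
};

struct toy_sb;
typedef int (*toy_iter_fn)(struct toy_inode *inode, void *priv);
typedef int (*toy_sb_iter_fn)(struct toy_sb *sb, toy_iter_fn fn, void *priv);

struct toy_sb {
	struct toy_inode *list;		/* generic list, like sb->s_inodes */
	struct toy_inode **table;	/* fs-private cache (assumed) */
	size_t table_len;
	toy_sb_iter_fn iter_inodes;	/* optional fs-provided iterator */
};

#define TOY_ITER_ABORT 1	/* callback return value: stop, no error */

/* Hold a reference around the callback, as the generic iterator would. */
static int toy_visit(struct toy_inode *inode, toy_iter_fn fn, void *priv)
{
	int ret;

	inode->refcount++;
	ret = fn(inode, priv);
	inode->refcount--;
	return ret;
}

/* Default mechanism: walk the per-sb linked list. */
static int toy_generic_iter(struct toy_sb *sb, toy_iter_fn fn, void *priv)
{
	struct toy_inode *inode;
	int ret;

	for (inode = sb->list; inode; inode = inode->next) {
		ret = toy_visit(inode, fn, priv);
		if (ret)
			return ret == TOY_ITER_ABORT ? 0 : ret;
	}
	return 0;
}

/* Filesystem mechanism: walk a private table (in reverse, to be visible). */
static int toy_table_iter(struct toy_sb *sb, toy_iter_fn fn, void *priv)
{
	size_t i;
	int ret;

	for (i = sb->table_len; i > 0; i--) {
		ret = toy_visit(sb->table[i - 1], fn, priv);
		if (ret)
			return ret == TOY_ITER_ABORT ? 0 : ret;
	}
	return 0;
}

/* Single entry point: callers cannot tell which mechanism runs. */
int toy_iter_sb_inodes(struct toy_sb *sb, toy_iter_fn fn, void *priv)
{
	if (sb->iter_inodes)
		return sb->iter_inodes(sb, fn, priv);
	return toy_generic_iter(sb, fn, priv);
}

/* Example callback: record visit order as decimal digits. */
static int toy_record_cb(struct toy_inode *inode, void *priv)
{
	long *acc = priv;

	*acc = *acc * 10 + (long)inode->ino;
	return 0;
}

/* Same caller code for both mechanisms; only the sb hook differs. */
long toy_demo(int use_fs_iterator)
{
	struct toy_inode c = { 3, 0, NULL };
	struct toy_inode b = { 2, 0, &c };
	struct toy_inode a = { 1, 0, &b };
	struct toy_inode *table[] = { &a, &b, &c };
	struct toy_sb sb = { &a, table, 3, NULL };
	long acc = 0;

	if (use_fs_iterator)
		sb.iter_inodes = toy_table_iter;
	toy_iter_sb_inodes(&sb, toy_record_cb, &acc);
	return acc;
}
```

In this sketch toy_demo(0) walks the list and toy_demo(1) walks the
table, yet the callback and the call site are identical. A macro that
expands to a list walk at every call site has no equivalent place to
hang the per-filesystem hook.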


-Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx



