Re: [RFC PATCH 0/7] vfs: improving inode cache iteration scalability

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi Dave!

On Wed 02-10-24 11:33:17, Dave Chinner wrote:
> There are two superblock iterator functions provided. The first is a
> generic iterator that provides safe, reference counted inodes for
> the callback to operate on. This is generally what most sb->s_inodes
> iterators use, and it allows the iterator to drop locks and perform
> blocking operations on the inode before moving to the next inode in
> the sb->s_inodes list.
> 
> There is one quirk to this interface - INO_ITER_REFERENCE - because
> fsnotify iterates the inode cache -after- evict_inodes() has been
> called during superblock shutdown to evict all non-referenced
> inodes. Hence it should only find referenced inodes, and it has
> a check to skip unreferenced inodes. This flag does the same.

Overall I really like the series. A lot of duplicated code removed and
scalability improved, we don't get such deals frequently :) Regarding
INO_ITER_REFERENCE I think that after commit 1edc8eb2e9313 ("fs: call
fsnotify_sb_delete after evict_inodes") the check for 0 i_count in
fsnotify_unmount_inodes() isn't that useful anymore so I'd be actually fine
dropping it (as a separate patch please).

That being said I'd like to discuss one thing: As you have surely noticed,
some of the places iterating inodes perform additional checks on the inode
to determine whether the inode is interesting or not (e.g. the Landlock
iterator or iterators in quota code) to avoid the unnecessary iget / iput
and locking dance. The inode refcount check you've worked-around with
INO_ITER_REFERENCE is a special case of that. Have you considered option to
provide callback for the check inside the iterator?

Also maybe I'm went a *bit* overboard here with macro magic but the code
below should provide an iterator that you can use like:

	for_each_sb_inode(sb, inode, inode_eligible_check(inode)) {
		do my stuff here
	}

that will avoid any indirect calls and will magically handle all the
cleanup that needs to be done if you break / jump out of the loop or
similar. I actually find such constructs more convenient to use than your
version of the iterator because there's no need to create & pass around the
additional data structure for the iterator body, no need for special return
values to abort iteration etc.

								Honza

/* Find next inode on the inode list eligible for processing */
#define sb_inode_iter_next(sb, inode, old_inode, inode_eligible) 	\
({									\
	struct inode *ret = NULL;					\
									\
	cond_resched();							\
	spin_lock(&(sb)->s_inode_list_lock);				\
	if (!(inode))							\
		inode = list_first_entry((sb)->s_inodes, struct inode,	\
					 i_sb_list);			\
	while (1) {							\
		if (list_entry_is_head(inode, (sb)->s_inodes, i_sb_list)) { \
			spin_unlock(&(sb)->s_inode_list_lock);		\
			break;						\
		}							\
		spin_lock(&inode->i_lock);				\
		if ((inode)->i_state & (I_NEW | I_FREEING | I_WILL_FREE) || \
		    !inode_eligible) {					\
			spin_unlock(&(inode)->i_lock);			\
			continue;					\
		}							\
		__iget(inode);						\
		spin_unlock(&(inode)->i_lock);				\
		spin_unlock(&(sb)->s_inode_list_lock);			\
		iput(*old_inode);					\
		*old_inode = inode;					\
		ret = inode;						\
		break;							\
	}								\
	ret;								\
})

#define for_each_sb_inode(sb, inode, inode_eligible)			\
	for (DEFINE_FREE(old_inode, struct inode *, if (_T) iput(_T)),	\
	     inode = NULL;						\
	     inode = sb_inode_iter_next((sb), inode, &old_inode,	\
					 inode_eligible);		\
	    )
	     

-- 
Jan Kara <jack@xxxxxxxx>
SUSE Labs, CR




[Index of Archives]     [Linux Ext4 Filesystem]     [Union Filesystem]     [Filesystem Testing]     [Ceph Users]     [Ecryptfs]     [NTFS 3]     [AutoFS]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux Cachefs]     [Reiser Filesystem]     [Linux RAID]     [NTFS 3]     [Samba]     [Device Mapper]     [CEPH Development]

  Powered by Linux