On Tue, 2021-06-01 at 15:18 +0200, Miklos Szeredi wrote:
> On Fri, 28 May 2021 at 08:34, Ian Kent <raven@xxxxxxxxxx> wrote:
> >
> > The inode operations .permission() and .getattr() use the kernfs node
> > write lock but all that's needed is to keep the rb tree stable while
> > updating the inode attributes as well as protecting the update itself
> > against concurrent changes.
> >
> > And .permission() is called frequently during path walks and can cause
> > quite a bit of contention between kernfs node operations and path
> > walks when the number of concurrent walks is high.
> >
> > To change kernfs_iop_getattr() and kernfs_iop_permission() to take
> > the rw sem read lock instead of the write lock an additional lock is
> > needed to protect against multiple processes concurrently updating
> > the inode attributes and link count in kernfs_refresh_inode().
> >
> > The inode i_lock seems like the sensible thing to use to protect these
> > inode attribute updates so use it in kernfs_refresh_inode().
> >
> > Signed-off-by: Ian Kent <raven@xxxxxxxxxx>
> > ---
> >  fs/kernfs/inode.c | 10 ++++++----
> >  fs/kernfs/mount.c |  4 ++--
> >  2 files changed, 8 insertions(+), 6 deletions(-)
> >
> > diff --git a/fs/kernfs/inode.c b/fs/kernfs/inode.c
> > index 3b01e9e61f14e..6728ecd81eb37 100644
> > --- a/fs/kernfs/inode.c
> > +++ b/fs/kernfs/inode.c
> > @@ -172,6 +172,7 @@ static void kernfs_refresh_inode(struct kernfs_node *kn, struct inode *inode)
> >  {
> >  	struct kernfs_iattrs *attrs = kn->iattr;
> >
> > +	spin_lock(&inode->i_lock);
> >  	inode->i_mode = kn->mode;
> >  	if (attrs)
> >  		/*
> > @@ -182,6 +183,7 @@ static void kernfs_refresh_inode(struct kernfs_node *kn, struct inode *inode)
> >
> >  	if (kernfs_type(kn) == KERNFS_DIR)
> >  		set_nlink(inode, kn->dir.subdirs + 2);
> > +	spin_unlock(&inode->i_lock);
> >  }
> >
> >  int kernfs_iop_getattr(struct user_namespace *mnt_userns,
> > @@ -191,9 +193,9 @@ int kernfs_iop_getattr(struct user_namespace *mnt_userns,
> >  	struct inode *inode = d_inode(path->dentry);
> >  	struct kernfs_node *kn = inode->i_private;
> >
> > -	down_write(&kernfs_rwsem);
> > +	down_read(&kernfs_rwsem);
> >  	kernfs_refresh_inode(kn, inode);
> > -	up_write(&kernfs_rwsem);
> > +	up_read(&kernfs_rwsem);
> >
> >  	generic_fillattr(&init_user_ns, inode, stat);
> >  	return 0;
> > @@ -284,9 +286,9 @@ int kernfs_iop_permission(struct user_namespace *mnt_userns,
> >
> >  	kn = inode->i_private;
> >
> > -	down_write(&kernfs_rwsem);
> > +	down_read(&kernfs_rwsem);
> >  	kernfs_refresh_inode(kn, inode);
> > -	up_write(&kernfs_rwsem);
> > +	up_read(&kernfs_rwsem);
> >
> >  	return generic_permission(&init_user_ns, inode, mask);
> >  }
> > diff --git a/fs/kernfs/mount.c b/fs/kernfs/mount.c
> > index baa4155ba2edf..f2f909d09f522 100644
> > --- a/fs/kernfs/mount.c
> > +++ b/fs/kernfs/mount.c
> > @@ -255,9 +255,9 @@ static int kernfs_fill_super(struct super_block *sb, struct kernfs_fs_context *k
> >  	sb->s_shrink.seeks = 0;
> >
> >  	/* get root inode, initialize and unlock it */
> > -	down_write(&kernfs_rwsem);
> > +	down_read(&kernfs_rwsem);
> >  	inode = kernfs_get_inode(sb, info->root->kn);
> > -	up_write(&kernfs_rwsem);
> > +	up_read(&kernfs_rwsem);
> >  	if (!inode) {
> >  		pr_debug("kernfs: could not get root inode\n");
> >  		return -ENOMEM;
> >
>
> This last hunk is not mentioned in the patch header.  Why is this
> needed?

Yes, that's right.

The lock is needed to keep the node rb tree stable.

kernfs_get_inode() calls kernfs_refresh_inode() indirectly. Since the
i_lock is probably not needed here, this hunk could just as well have
gone into the rwsem change, but because of that kernfs_refresh_inode()
call it also makes sense to put it here.

I'd prefer to keep it here. Clearly what's going on isn't as obvious
as I thought, so I can add this reasoning to the description if you
still think it's worthwhile?

>
> Otherwise looks good.
>
> Thanks,
> Miklos
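
For anyone following the thread, the locking pattern discussed above can
be sketched in userspace C. This is only an illustration under stated
assumptions, not the kernel code: a pthread rwlock stands in for
kernfs_rwsem, a pthread mutex stands in for the inode i_lock spinlock,
and every demo_* name and struct below is hypothetical. Readers take the
rwlock shared, so they can run concurrently, and the inner lock
serializes the attribute and link count update itself.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>

/* Hypothetical stand-ins for kernfs_node and inode; not kernel types. */
struct demo_node  { unsigned int mode; unsigned int subdirs; };
struct demo_inode { pthread_mutex_t i_lock; unsigned int i_mode; unsigned int i_nlink; };
struct demo_args  { struct demo_node *kn; struct demo_inode *inode; };

/* Plays the role of kernfs_rwsem: keeps the node data stable for readers. */
static pthread_rwlock_t demo_rwsem = PTHREAD_RWLOCK_INITIALIZER;

/* Copy node state into the cached inode; i_lock serializes the update. */
static void demo_refresh_inode(const struct demo_node *kn, struct demo_inode *inode)
{
	pthread_mutex_lock(&inode->i_lock);
	inode->i_mode = kn->mode;
	inode->i_nlink = kn->subdirs + 2;
	pthread_mutex_unlock(&inode->i_lock);
}

/* Read side: a shared lock is enough because the refresh locks itself. */
static void demo_getattr(const struct demo_node *kn, struct demo_inode *inode)
{
	pthread_rwlock_rdlock(&demo_rwsem);
	demo_refresh_inode(kn, inode);
	pthread_rwlock_unlock(&demo_rwsem);
}

/* Write side: changes to the node still take the lock exclusively. */
static void demo_add_subdir(struct demo_node *kn)
{
	pthread_rwlock_wrlock(&demo_rwsem);
	kn->subdirs++;
	pthread_rwlock_unlock(&demo_rwsem);
}

static void *demo_reader(void *p)
{
	struct demo_args *a = p;

	for (int i = 0; i < 1000; i++)
		demo_getattr(a->kn, a->inode);
	return NULL;
}

/* Run 4 concurrent readers against one writer; return the final link count. */
static unsigned int demo_run(unsigned int new_subdirs)
{
	struct demo_node kn = { .mode = 0755, .subdirs = 0 };
	struct demo_inode inode = { .i_mode = 0, .i_nlink = 0 };
	struct demo_args args = { &kn, &inode };
	pthread_t readers[4];

	pthread_mutex_init(&inode.i_lock, NULL);
	for (int i = 0; i < 4; i++) {
		int rc = pthread_create(&readers[i], NULL, demo_reader, &args);
		assert(rc == 0);
		(void)rc;
	}
	for (unsigned int i = 0; i < new_subdirs; i++)
		demo_add_subdir(&kn);
	for (int i = 0; i < 4; i++)
		pthread_join(readers[i], NULL);

	demo_getattr(&kn, &inode);	/* final refresh once the writer is done */
	pthread_mutex_destroy(&inode.i_lock);
	return inode.i_nlink;
}
```

Calling demo_run(3) from a small main() exercises the read path from
four threads while the writer adds three subdirectories. Without the
inner lock, two readers holding the shared lock could interleave their
stores to the cached fields, which is the situation the i_lock in
kernfs_refresh_inode() is meant to rule out.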