Re: [PATCH 1/2] kernfs: dont take i_lock on inode attr read

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 28/7/23 08:00, Ian Kent wrote:
On 27/7/23 12:30, Imran Khan wrote:
Hello Ian,
Sorry for late reply. I was about to reply this week.

On 27/7/2023 10:38 am, Ian Kent wrote:
On 20/7/23 10:03, Ian Kent wrote:
On Wed, 2023-07-19 at 12:23 +0800, Ian Kent wrote:
[...]
I do see a problem with recent changes.

I'll send this off to Greg after I've done some testing (primarily just
compile and function).

Here's a patch which describes what I found.

Comments are, of course, welcome, ;)
Anders I was hoping you would check if/what lockdep trace

you get with this patch.


Imran, I was hoping you would comment on my change as it

relates to the kernfs_iattr_rwsem changes.


Ian

kernfs: fix missing kernfs_iattr_rwsem locking

From: Ian Kent <raven@xxxxxxxxxx>

When the kernfs_iattr_rwsem was introduced a case was missed.

The update of the kernfs directory node child count was also protected
by the kernfs_rwsem and needs to be included in the change so that the
child count (and so the inode n_link attribute) does not change while
holding the rwsem for read.

kernfs direcytory node's child count changes in kernfs_(un)link_sibling and
these are getting invoked while adding (kernfs_add_one),
removing(__kernfs_remove) or moving (kernfs_rename_ns)a node. Each of these
operations proceed under kernfs_rwsem and I see each invocation of
kernfs_link/unlink_sibling during the above mentioned operations is happening
under kernfs_rwsem.
So the child count should still be protected by kernfs_rwsem and we should not
need to acquire kernfs_iattr_rwsem in kernfs_link/unlink_sibling.

Yes, that's exactly what I intended (assuming you mean write lock in those cases)

when I did it so now I wonder what I saw that lead to my patch, I'll need to look

again ...

Ahh, I see why I thought this ...

It's the hunk:

@@ -285,10 +285,10 @@ int kernfs_iop_permission(struct mnt_idmap *idmap,
        kn = inode->i_private;
        root = kernfs_root(kn);

-       down_read(&root->kernfs_rwsem);
+       down_read(&root->kernfs_iattr_rwsem);
        kernfs_refresh_inode(kn, inode);
        ret = generic_permission(&nop_mnt_idmap, inode, mask);
-       up_read(&root->kernfs_rwsem);
+       up_read(&root->kernfs_iattr_rwsem);

        return ret;
 }

which takes away the kernfs_rwsem and introduces the possibility of

the change. It may be more instructive to add back taking the read

lock of kernfs_rwsem in .permission() than altering the sibling link

and unlink functions, I mean I even caught myself on it.


Ian




Kindly let me know your thoughts. I would still like to see new lockdep traces
with this change.

Indeed, I hope Anders can find time to get the trace.


Ian


Thanks,
Imran

Fixes: 9caf696142 (kernfs: Introduce separate rwsem to protect inode
attributes)

Signed-off-by: Ian Kent <raven@xxxxxxxxxx>
Cc: Anders Roxell <anders.roxell@xxxxxxxxxx>
Cc: Imran Khan <imran.f.khan@xxxxxxxxxx>
Cc: Arnd Bergmann <arnd@xxxxxxxx>
Cc: Minchan Kim <minchan@xxxxxxxxxx>
Cc: Eric Sandeen <sandeen@xxxxxxxxxxx>
---
   fs/kernfs/dir.c |    4 ++++
   1 file changed, 4 insertions(+)

diff --git a/fs/kernfs/dir.c b/fs/kernfs/dir.c
index 45b6919903e6..6e84bb69602e 100644
--- a/fs/kernfs/dir.c
+++ b/fs/kernfs/dir.c
@@ -383,9 +383,11 @@ static int kernfs_link_sibling(struct kernfs_node
*kn)
       rb_insert_color(&kn->rb, &kn->parent->dir.children);
         /* successfully added, account subdir number */
+ down_write(&kernfs_root(kn)->kernfs_iattr_rwsem);
       if (kernfs_type(kn) == KERNFS_DIR)
           kn->parent->dir.subdirs++;
       kernfs_inc_rev(kn->parent);
+    up_write(&kernfs_root(kn)->kernfs_iattr_rwsem);
         return 0;
   }
@@ -408,9 +410,11 @@ static bool kernfs_unlink_sibling(struct
kernfs_node *kn)
       if (RB_EMPTY_NODE(&kn->rb))
           return false;
   + down_write(&kernfs_root(kn)->kernfs_iattr_rwsem);
       if (kernfs_type(kn) == KERNFS_DIR)
           kn->parent->dir.subdirs--;
       kernfs_inc_rev(kn->parent);
+    up_write(&kernfs_root(kn)->kernfs_iattr_rwsem);
         rb_erase(&kn->rb, &kn->parent->dir.children);
       RB_CLEAR_NODE(&kn->rb);




[Index of Archives]     [Linux Ext4 Filesystem]     [Union Filesystem]     [Filesystem Testing]     [Ceph Users]     [Ecryptfs]     [NTFS 3]     [AutoFS]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux Cachefs]     [Reiser Filesystem]     [Linux RAID]     [NTFS 3]     [Samba]     [Device Mapper]     [CEPH Development]

  Powered by Linux