On Mon, 2021-06-14 at 14:52 +0800, Ian Kent wrote: > On Mon, 2021-06-14 at 09:32 +0800, Ian Kent wrote: > > On Sat, 2021-06-12 at 01:45 +0000, Al Viro wrote: > > > On Wed, Jun 09, 2021 at 04:51:22PM +0800, Ian Kent wrote: > > > > The inode operations .permission() and .getattr() use the > > > > kernfs > > > > node > > > > write lock but all that's needed is to keep the rb tree stable > > > > while > > > > updating the inode attributes as well as protecting the update > > > > itself > > > > against concurrent changes. > > > > > > Huh? Where does it access the rbtree at all? Confused... > > > > > > > diff --git a/fs/kernfs/inode.c b/fs/kernfs/inode.c > > > > index 3b01e9e61f14e..6728ecd81eb37 100644 > > > > --- a/fs/kernfs/inode.c > > > > +++ b/fs/kernfs/inode.c > > > > @@ -172,6 +172,7 @@ static void kernfs_refresh_inode(struct > > > > kernfs_node *kn, struct inode *inode) > > > > { > > > > struct kernfs_iattrs *attrs = kn->iattr; > > > > > > > > + spin_lock(&inode->i_lock); > > > > inode->i_mode = kn->mode; > > > > if (attrs) > > > > /* > > > > @@ -182,6 +183,7 @@ static void kernfs_refresh_inode(struct > > > > kernfs_node *kn, struct inode *inode) > > > > > > > > if (kernfs_type(kn) == KERNFS_DIR) > > > > set_nlink(inode, kn->dir.subdirs + 2); > > > > + spin_unlock(&inode->i_lock); > > > > } > > > > > > Even more so - just what are you serializing here? That code > > > synchronizes inode > > > metadata with those in kernfs_node. Suppose you've got two > > > threads > > > doing > > > ->permission(); the first one gets through kernfs_refresh_inode() > > > and > > > goes into > > > generic_permission(). No locks are held, so > > > kernfs_refresh_inode() > > > from another > > > thread can run in parallel with generic_permission(). > > > > > > If that's not a problem, why two kernfs_refresh_inode() done in > > > parallel would > > > be a problem? > > > > > > Thread 1: > > > permission > > > done refresh, all locks released now > > > Thread 2: > > > change metadata in kernfs_node > > > Thread 2: > > > permission > > > goes into refresh, copying metadata into inode > > > Thread 1: > > > generic_permission() > > > No locks in common between the last two operations, so > > > we generic_permission() might see partially updated metadata. > > > Either we don't give a fuck (in which case I don't understand > > > what purpose does that ->i_lock serve) *or* we need the exclusion > > > to cover a wider area. > > > > This didn't occur to me, obviously. > > > > It seems to me this can happen with the original code too although > > using a mutex might reduce the likelihood of it happening. > > > > Still ->permission() is meant to be a read-only function so the VFS > > shouldn't need to care about it. > > > > Do you have any suggestions on how to handle this. > > Perhaps the only way is to ensure the inode is updated only in > > functions that are expected to do this. > > IIRC Greg and Tejun weren't averse to adding a field to the > struct kernfs_iattrs, but there were concerns about increasing > memory usage. > > Because of this I think the best way to handle this would be to > broaden the scope of the i_lock to cover the generic calls in > kernfs_iop_getattr() and kernfs_iop_permission(). The only other > call to kernfs_refresh_inode() is at inode initialization and > then only for I_NEW inodes so that should be ok. Also both > generic_permission() and generic_fillattr() are reading from the > inode so not likely to need to take the i_lock any time soon (is > this a reasonable assumption Al?). > > Do you think this is a sensible way to go Al? Unless of course we don't care about taking a lock here at all, Greg, Tejun? Ian