On Thu, Oct 18, 2018 at 1:22 PM Jan Kara <jack@xxxxxxx> wrote: > > Detaching of mark connector from fsnotify_put_mark() can race with > unmounting of the filesystem like: > > CPU1 CPU2 > fsnotify_put_mark() > spin_lock(&conn->lock); > ... > inode = fsnotify_detach_connector_from_object(conn) > spin_unlock(&conn->lock); > generic_shutdown_super() > fsnotify_unmount_inodes() > sees connector detached for inode > -> nothing to do > evict_inode() > barfs on pending inode reference > iput(inode); > > Resulting in "Busy inodes after unmount" message and possible kernel > oops. Make fsnotify_unmount_inodes() properly wait for outstanding inode > references from detached connectors. > > Note that the accounting of outstanding inode references in the > superblock can cause some cacheline contention on the counter. OTOH it > happens only during deletion of the last notification mark from an inode > (or during unlinking of watched inode) and that is not too bad. I have > measured time to create & delete inotify watch 100000 times from 64 > processes in parallel (each process having its own inotify group and its > own file on a shared superblock) on a 64 CPU machine. Average and > standard deviation of 15 runs look like: > > Avg Stddev > Vanilla 9.817400 0.276165 > Fixed 9.710467 0.228294 > > So there's no statistically significant difference. > Fixes: ??? > Signed-off-by: Jan Kara <jack@xxxxxxx> > --- > fs/notify/fsnotify.c | 3 +++ > fs/notify/mark.c | 31 ++++++++++++++++++++++++------- > include/linux/fs.h | 3 +++ > 3 files changed, 30 insertions(+), 7 deletions(-) > > diff --git a/fs/notify/fsnotify.c b/fs/notify/fsnotify.c > index f174397b63a0..00d4f4357724 100644 > --- a/fs/notify/fsnotify.c > +++ b/fs/notify/fsnotify.c > @@ -96,6 +96,9 @@ void fsnotify_unmount_inodes(struct super_block *sb) > > if (iput_inode) > iput(iput_inode); > + /* Wait for outstanding inode references from connectors */ > + wait_var_event(&sb->s_fsnotify_inode_refs, > + !atomic_long_read(&sb->s_fsnotify_inode_refs)); > } > > /* > diff --git a/fs/notify/mark.c b/fs/notify/mark.c > index 59cdb27826de..61c58e1759ee 100644 > --- a/fs/notify/mark.c > +++ b/fs/notify/mark.c > @@ -179,7 +179,7 @@ static void fsnotify_connector_destroy_workfn(struct work_struct *work) > } > } > > -static struct inode *fsnotify_detach_connector_from_object( > +static void *fsnotify_detach_connector_from_object( > struct fsnotify_mark_connector *conn) > { > struct inode *inode = NULL; > @@ -190,6 +190,7 @@ static struct inode *fsnotify_detach_connector_from_object( > if (conn->type == FSNOTIFY_OBJ_TYPE_INODE) { > inode = fsnotify_conn_inode(conn); > inode->i_fsnotify_mask = 0; > + atomic_long_inc(&inode->i_sb->s_fsnotify_inode_refs); > } else if (conn->type == FSNOTIFY_OBJ_TYPE_VFSMOUNT) { > fsnotify_conn_mount(conn)->mnt_fsnotify_mask = 0; > } > @@ -211,10 +212,26 @@ static void fsnotify_final_mark_destroy(struct fsnotify_mark *mark) > fsnotify_put_group(group); > } > > +/* Drop object reference originally held by a connector */ > +static void fsnotify_drop_object(void *objp) > +{ > + struct inode *inode; > + struct super_block *sb; > + > + if (!objp) > + return; > + /* Currently only inode references are passed to be dropped */ This patch would have been a whole lot shorter if you just implemented fsnotify_drop_inode(struct inode *inode) Did you have anything specific in mind when you made it generic? Anyway, if you do choose to leave this function "generic", please pass in conn->type to this function and do "the inode thing" only for type == FSNOTIFY_OBJ_TYPE_INODE. Thanks, Amir.