On Wed 15-03-17 16:11:17, Amir Goldstein wrote: > On Wed, Mar 15, 2017 at 12:46 PM, Jan Kara <jack@xxxxxxx> wrote: > > Currently notification marks are attached to object (inode or vfsmnt) by > > a hlist_head in the object. The list is also protected by a spinlock in > > the object. So while there is any mark attached to the list of marks, > > the object must be pinned in memory (and thus e.g. last iput() deleting > > inode cannot happen). Also for list iteration in fsnotify() to work, we > > must hold fsnotify_mark_srcu lock so that mark itself and > > mark->obj_list.next cannot get freed. Thus we are required to wait for > > response to fanotify events from userspace process with > > fsnotify_mark_srcu lock held. That causes issues when userspace process > > is buggy and does not reply to some event - basically the whole > > notification subsystem gets eventually stuck. > > > > So to be able to drop fsnotify_mark_srcu lock while waiting for > > response, we have to pin the mark in memory and make sure it stays in > > the object list (as removing the mark waiting for response could lead to > > lost notification events for groups later in the list). However we don't > > want inode reclaim to block on such mark as that would lead to system > > just locking up elsewhere. > > > > This commit is the first in the series that paves way towards solving > > these conflicting lifetime needs. Instead of anchoring the list of marks > > directly in the object, we anchor it in a dedicated structure > > (fsnotify_mark_connector) and just point to that structure from the > > object. The following commits will also add spinlock protecting the list > > and object pointer to the structure. > > > > Signed-off-by: Jan Kara <jack@xxxxxxx> > > This looks very clean. > One minor suggestion. Feel free to dismiss it. > > Reviewed-by: Amir Goldstein <amir73il@xxxxxxxxx> Thanks. > > +void fsnotify_connector_free(struct fsnotify_mark_connector **connp) > > +{ > > + if (*connp) { > > + kmem_cache_free(fsnotify_mark_connector_cachep, *connp); > > + *connp = NULL; > > + } > > +} > > + > > void fsnotify_set_mark_mask_locked(struct fsnotify_mark *mark, __u32 mask) > > { > > assert_spin_locked(&mark->lock); > > @@ -304,21 +322,43 @@ int fsnotify_compare_groups(struct fsnotify_group *a, struct fsnotify_group *b) > > return -1; > > } > > > > -/* Add mark into proper place in given list of marks */ > > -int fsnotify_add_mark_list(struct hlist_head *head, struct fsnotify_mark *mark, > > - int allow_dups) > > +/* > > + * Add mark into proper place in given list of marks. These marks may be used > > + * for the fsnotify backend to determine which event types should be delivered > > + * to which group and for which inodes. These marks are ordered according to > > + * priority, highest number first, and then by the group's location in memory. > > + */ > > +int fsnotify_add_mark_list(struct fsnotify_mark_connector **connp, > > + struct fsnotify_mark *mark, int allow_dups) > > { > > struct fsnotify_mark *lmark, *last = NULL; > > + struct fsnotify_mark_connector *conn; > > int cmp; > > > > + if (!*connp) { > > + conn = kmem_cache_alloc(fsnotify_mark_connector_cachep, > > + GFP_ATOMIC); > > + if (!conn) > > + return -ENOMEM; > > + INIT_HLIST_HEAD(&conn->list); > > + /* > > + * Make sure 'conn' initialization is visible. Matches > > + * lockless_dereference() in fsnotify(). > > + */ > > + smp_wmb(); > > + *connp = conn; > > + } else { > > + conn = *connp; > > + } > > + > > This chunk would look nicer in fsnotify_connector_get() helper. Good point. I've called the function fsnotify_grab_create_connector() to make it consistent with later fsnotify_grab_connector() and to stress clearly that this may allocate new connector unlike later fsnotify_grab_connector()... Honza -- Jan Kara <jack@xxxxxxxx> SUSE Labs, CR