On Wed, Mar 15, 2017 at 12:46 PM, Jan Kara <jack@xxxxxxx> wrote: > Currently notification marks are attached to object (inode or vfsmnt) by > a hlist_head in the object. The list is also protected by a spinlock in > the object. So while there is any mark attached to the list of marks, > the object must be pinned in memory (and thus e.g. last iput() deleting > inode cannot happen). Also for list iteration in fsnotify() to work, we > must hold fsnotify_mark_srcu lock so that mark itself and > mark->obj_list.next cannot get freed. Thus we are required to wait for > response to fanotify events from userspace process with > fsnotify_mark_srcu lock held. That causes issues when userspace process > is buggy and does not reply to some event - basically the whole > notification subsystem gets eventually stuck. > > So to be able to drop fsnotify_mark_srcu lock while waiting for > response, we have to pin the mark in memory and make sure it stays in > the object list (as removing the mark waiting for response could lead to > lost notification events for groups later in the list). However we don't > want inode reclaim to block on such mark as that would lead to system > just locking up elsewhere. > > This commit is the first in the series that paves way towards solving > these conflicting lifetime needs. Instead of anchoring the list of marks > directly in the object, we anchor it in a dedicated structure > (fsnotify_mark_connector) and just point to that structure from the > object. The following commits will also add spinlock protecting the list > and object pointer to the structure. > > Signed-off-by: Jan Kara <jack@xxxxxxx> This looks very clean. One minor suggestion. Feel free to dismiss it. Reviewed-by: Amir Goldstein <amir73il@xxxxxxxxx> [...] > +void fsnotify_connector_free(struct fsnotify_mark_connector **connp) > +{ > + if (*connp) { > + kmem_cache_free(fsnotify_mark_connector_cachep, *connp); > + *connp = NULL; > + } > +} > + > void fsnotify_set_mark_mask_locked(struct fsnotify_mark *mark, __u32 mask) > { > assert_spin_locked(&mark->lock); > @@ -304,21 +322,43 @@ int fsnotify_compare_groups(struct fsnotify_group *a, struct fsnotify_group *b) > return -1; > } > > -/* Add mark into proper place in given list of marks */ > -int fsnotify_add_mark_list(struct hlist_head *head, struct fsnotify_mark *mark, > - int allow_dups) > +/* > + * Add mark into proper place in given list of marks. These marks may be used > + * for the fsnotify backend to determine which event types should be delivered > + * to which group and for which inodes. These marks are ordered according to > + * priority, highest number first, and then by the group's location in memory. > + */ > +int fsnotify_add_mark_list(struct fsnotify_mark_connector **connp, > + struct fsnotify_mark *mark, int allow_dups) > { > struct fsnotify_mark *lmark, *last = NULL; > + struct fsnotify_mark_connector *conn; > int cmp; > > + if (!*connp) { > + conn = kmem_cache_alloc(fsnotify_mark_connector_cachep, > + GFP_ATOMIC); > + if (!conn) > + return -ENOMEM; > + INIT_HLIST_HEAD(&conn->list); > + /* > + * Make sure 'conn' initialization is visible. Matches > + * lockless_dereference() in fsnotify(). > + */ > + smp_wmb(); > + *connp = conn; > + } else { > + conn = *connp; > + } > + This chunk would look nicer in fsnotify_connector_get() helper.