On Wed 25-01-17 10:41:12, Miklos Szeredi wrote: > On Fri, Jan 20, 2017 at 2:21 PM, Jan Kara <jack@xxxxxxx> wrote: > > Currently notification marks are attached to object (inode or vfsmnt) by > > a hlist_head in the object. The list is also protected by a spinlock in > > the object. So while there is any mark attached to the list of marks, > > the object must be pinned in memory (and thus e.g. last iput() deleting > > inode cannot happen). Also for list iteration in fsnotify() to work, we > > must hold fsnotify_mark_srcu lock so that mark itself and > > mark->obj_list.next cannot get freed. Thus we are required to wait for > > response to fanotify events from userspace process with > > fsnotify_mark_srcu lock held. That causes issues when userspace process > > is buggy and does not reply to some event - basically the whole > > notification subsystem gets eventually stuck. > > > > So to be able to drop fsnotify_mark_srcu lock while waiting for > > response, we have to pin the mark in memory and make sure it stays in > > the object list (as removing the mark waiting for response could lead to > > lost notification events for groups later in the list). However we don't > > want inode reclaim to block on such mark as that would lead to system > > just locking up elsewhere. > > > > This commit tries to pave a way towards solving these conflicting > > lifetime needs. Instead of anchoring the list of marks directly in the > > object, we anchor it in a dedicated structure (fsnotify_mark_connector) and > > just point to that structure from the object. Also the list is protected > > by a spinlock contained in that structure. With this, we can detach > > notification marks from object without having to modify the list itself. > > I wonder if this big patch could be conveniently split up into > structural and functional parts. > > I.e. first add the connector structure but make its lifetime match > that of the inode. That would mean just adding the extra > dereferences, but not the tricky rcu-conformant creation and > destruction of the connecor. Yeah, probably I could split it into three parts. The addition of dynamically added (but freed only on object free) connector structure, the change of locking from i_lock/d_lock into connector->lock, the freeing of connector when the last mark on the list gets freed (RCU bits go there). > Also there seems to be a fair amount of moving code around, which > could warrant a separate patch for clarity. I don't think there's substantial moving of code. However the locking change above results in some functions becoming redundant so I just deleted them. After the split it should be easier to understand. Honza -- Jan Kara <jack@xxxxxxxx> SUSE Labs, CR -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html