On Mon 14-11-16 17:09:47, Amir Goldstein wrote: > On Mon, Nov 14, 2016 at 3:20 PM, Jan Kara <jack@xxxxxxx> wrote: > > On Mon 14-11-16 13:48:27, Amir Goldstein wrote: > >> Handling fanotify events does not entail dereferencing fsnotify_mark > >> beyond the point of fanotify_should_send_event(). > >> > >> For the case of permission events, which may block indefinitely, > >> return -EAGAIN and then fsnotify() will call handle_event() again > >> without a reference to the mark. > >> > >> Without a reference to the mark, there is no need to call > >> handle_event() under fsnotify_mark_srcu[0] read side lock, > >> so we drop fsnotify_mark_srcu[0] while handling the event > >> and grab it back before continuing to the next mark. > >> > >> After this change, a blocking permission event will no longer > >> block closing of any file descriptors of 0 priority groups, > >> i.e: inotify and fanotify groups of class FAN_CLASS_NOTIF. > >> > >> Reported-by: Miklos Szeredi <miklos@xxxxxxxxxx> > >> Signed-off-by: Amir Goldstein <amir73il@xxxxxxxxx> > > > > Well, this has a similar problem as my attempt to fix the issue. The > > current mark can get removed from the mark list while waiting for userspace > > response. ->next pointer is still valid at that moment but ->next->pprev > > already points to mark preceding us (that's how rcu lists work). When > > ->next mark then gets removed from the list and destroyed (it may be > > protected by the second srcu), our ->next pointer points to freed memory. > > Oh! I missed the fact that the SRCU does not protect mark->obj_list. > Can resurrecting mark->free_list buy us anything? > Perhaps keep the marks on obj_list without FLAG_ATTACHED > and then remove marks from both free_list and obj_list post srcu_synchronize()? > I think that removing mark->obj_list was optimization of good faith > and not because it really hurts the system's memory usage that much? You have to be really careful here. Minimally you'd need then another srcu_synchronize() call after removing mark from the list and before freeing the mark so that you are sure no process iterating the list can have a reference to a mark being freed but even then it needs careful checking whether it would work. The joy of lockless algorithms... > > I have some ideas how to fix up issues with my refcounting approach - > > refcount would pin marks not only in memory but also in lists but I have > > yet to see whether that works out sensibly (it would mean that dropping > > mark reference would then need to take group->mark_mutex and that may cause > > lock ordering issues). > > It sounds like a chicken and egg problem, but I suppose you don't mean > the actual mark refcount, but a secondary "pinned refcount"? No, I mean the original refcount. I have patches that already postpone part of the mark cleanup to happen only once the refcount drops to 0. > Anyway, if you have something ready, my test setup is already in place.. Thanks for an offer. I don't have anything yet, hopefully later today or tomorrow. Honza -- Jan Kara <jack@xxxxxxxx> SUSE Labs, CR -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html