On Mon 26-12-16 16:15:23, Amir Goldstein wrote: > On Thu, Dec 22, 2016 at 11:15 AM, Jan Kara <jack@xxxxxxx> wrote: > > Currently we queue mark into a list of marks for destruction in > > __fsnotify_free_mark() and keep last mark reference dangling. After the > > worker waits for SRCU period, it drops the last reference to the mark > > which frees it. This scheme has the disadvantage that if we hold > > reference to mark and drop and reacquire SRCU lock, the mark can get > > freed immediately which is slightly inconvenient and we will need to > > avoid this in the future. > > > > Move to a scheme where queueing of mark into a list of marks for > > destruction happens when the last reference to the mark is dropped. Also > > drop reference to the mark held by group list already when mark is > > removed from that list instead of dropping it only from the destruction > > worker. > > > > The BEFORE section refers to what SRCU protects, which this patch > slightly changes. Can you please add to AFTER section, what is protected > by SRCU after this patch. Ah, the changelog is a victim of me reshuffling patches. It was written in a situation when mark reference did not protect from any list removal yet. I'll rewrite it. Thanks for noticing. > IIUC, SRCU protects from freeing the mark, > but it does not protect from removing mark from group list, so after > drop and reacquire SRCU with elevated mark refcount, mark can find > itself not on any list. You are correct that SRCU only protects from freeing the mark and it also makes sure the inode / vfsmount list traversal is fine (because of using _rcu list primitives during list manipulation) although the mark can be removed from that list in parallel. It does not give any guarantee for group list as you correctly note. But mark reference pins mark in the inode / vfsmount list (after patch 10), so once we have that reference we are sure mark stays in inode / vfsmount list. > For example in inotify_handle_event() when calling fsnotify_destroy_mark() > for IN_ONESHOT mark. So in that place we don't hold any mark reference ourselves so mark can indeed get removed from all the lists - but we still hold SRCU lock so we are sure mark cannot get freed and inode list traversal can still continue. > > @@ -537,20 +527,13 @@ int fsnotify_add_mark_locked(struct fsnotify_mark *mark, > > > > return ret; > > err: > > - mark->flags &= ~FSNOTIFY_MARK_FLAG_ALIVE; > > + mark->flags &= ~(FSNOTIFY_MARK_FLAG_ALIVE | > > + FSNOTIFY_MARK_FLAG_ATTACHED); > > list_del_init(&mark->g_list); > > - fsnotify_put_group(group); > > fsnotify_put_group() is removed by this patch here and not added anywhere else > nor is any fsnotify_get_group() call removed. Is this a leak or maybe > fixed later on?? Note that I also removed mark->group = NULL below. So fsnotify_put_mark() and followup mark destruction will take care of properly freeing group reference. There's no reason to special-case this in fsnotify_add_mark_locked()... > > > - mark->group = NULL; > > atomic_dec(&group->num_marks); > > - > > spin_unlock(&mark->lock); > > > > - spin_lock(&destroy_lock); > > - list_add(&mark->g_list, &destroy_list); > > - spin_unlock(&destroy_lock); > > - queue_delayed_work(system_unbound_wq, &reaper_work, > > - FSNOTIFY_REAPER_DELAY); > > - > > + fsnotify_put_mark(mark); > > return ret; > > } Honza -- Jan Kara <jack@xxxxxxxx> SUSE Labs, CR -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html