> On Mon 22-06-15 16:23:16, Ashish Sangwan wrote:
> > For deleting the fsnotify_mark related with an inode, there are 2 paths in the
> > kernel. When the inotify fd is closed, all the marks belonging to a group are
> > removed one by one in fsnotify_clear_marks_by_group_flags. The other path is
> > when the inode is removed from user space by unlink: fsnotify_destroy_mark is
> > called to delete a single mark.
> > There is a race between these 2 paths which is caused by the temporary
> > release of the mark_mutex inside fsnotify_destroy_mark_locked.
> > The race happens when the inotify app monitoring the file(s) exits, triggering
> > fsnotify_clear_marks_by_group_flags to delete the marks.
> > This function uses the lmark pointer to move to the next node after a safe
> > removal of the node. In parallel, if there is an rm call for a file such that
> > lmark is pointing to the mark which is removed by this rm call, lmark ends up
> > pointing to freed memory. Now, when we try to move to the next node using
> > lmark, it triggers an invalid virtual address crash.
> > Although fsnotify_clear_marks_by_group_flags and fsnotify_destroy_mark are
> > synchronized by mark_mutex, both of these functions call
> > fsnotify_destroy_mark_locked, which releases the mark_mutex and acquires it
> > again, creating a subtle race window. There seems to be no reason for releasing
> > mark_mutex, so this patch removes the mutex_unlock call.
>
> Thanks for the report and the analysis. I agree with your problem analysis.
> Indeed the loop in fsnotify_clear_marks_by_group_flags() isn't safe against
> us dropping the mark_mutex inside fsnotify_destroy_mark_locked(). However
> mark_mutex is dropped in fsnotify_destroy_mark_locked() for a purpose. We
> call ->freeing_mark() callback from there and that should be called without
> mark_mutex.
> In particular inotify uses this callback to send the IN_IGNORED
> event and that code certainly isn't prepared to be called under mark_mutex
> and you likely introduce interesting deadlock possibilities there.

Right. inotify_handle_event() can recursively call fsnotify_destroy_mark()
when IN_ONESHOT is set. I missed it, thanks for pointing it out.

>
> Looking into this in more detail, it might be worthwhile to revisit how
> mark_mutex is used since at least fanotify and dnotify use it for more than
> just a protection of list of group marks and untangling this would simplify
> things. But that's a longer term goal.
>
> A relatively simple fix for your issue is to split group list of marks into
> a list of inode marks and a list of mount marks. Then destroying becomes
> much simpler because we always discard the whole list (or both of them) and
> we can easily avoid problems with list corruption when dropping the
> mark_mutex. I can write the patch later or you can do that if you are

Sorry, I could not understand why the group's list of marks needs to be split.
I was browsing through the old code, from the days when mark_mutex was not
present, and it looked like below:

void fsnotify_clear_marks_by_group_flags(struct fsnotify_group *group,
					 unsigned int flags)
{
	struct fsnotify_mark *lmark, *mark;
	LIST_HEAD(free_list);

	spin_lock(&group->mark_lock);
	list_for_each_entry_safe(mark, lmark, &group->marks_list, g_list) {
		if (mark->flags & flags) {
			list_add(&mark->free_g_list, &free_list);
			list_del_init(&mark->g_list);
			fsnotify_get_mark(mark);
		}
	}
	spin_unlock(&group->mark_lock);

	list_for_each_entry_safe(mark, lmark, &free_list, free_g_list) {
		fsnotify_destroy_mark(mark);
		fsnotify_put_mark(mark);
	}
}

How about using a temporary on-stack list_head like above?

> interested.
>
> 								Honza
> >
> > Signed-off-by: Ashish Sangwan <a.sangwan@xxxxxxxxxxx>
> > Reviewed-by: Amit Sahrawat <a.sahrawat@xxxxxxxxxxx>
> > ---
> >  fs/notify/mark.c |    4 ----
> >  1 files changed, 0 insertions(+), 4 deletions(-)
> >
> > diff --git a/fs/notify/mark.c b/fs/notify/mark.c
> > index 92e48c7..4ee419f 100755
> > --- a/fs/notify/mark.c
> > +++ b/fs/notify/mark.c
> > @@ -157,8 +157,6 @@ void fsnotify_destroy_mark_locked(struct fsnotify_mark *mark,
> >
> >  	if (inode && (mark->flags & FSNOTIFY_MARK_FLAG_OBJECT_PINNED))
> >  		iput(inode);
> > -	/* release lock temporarily */
> > -	mutex_unlock(&group->mark_mutex);
> >
> >  	spin_lock(&destroy_lock);
> >  	list_add(&mark->g_list, &destroy_list);
> > @@ -191,8 +189,6 @@ void fsnotify_destroy_mark_locked(struct fsnotify_mark *mark,
> >  	 */
> >
> >  	atomic_dec(&group->num_marks);
> > -
> > -	mutex_lock_nested(&group->mark_mutex, SINGLE_DEPTH_NESTING);
> >  }
> >
> >  void fsnotify_destroy_mark(struct fsnotify_mark *mark,
> > --
> > 1.7.7
> >
> --
> Jan Kara <jack@xxxxxxx>
> SUSE Labs, CR
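
To make the on-stack list idea from above a bit more concrete, here is a
minimal userspace sketch of the shape I have in mind. It is not kernel code
and not a proposed patch: struct node, struct mark, struct group, group_lock(),
group_unlock(), destroy_mark() and demo_clear() are all hypothetical stand-ins,
the lock is modelled as a plain flag so that asserts can check when it is held,
and reference counting is left out entirely. The only point is the two phases:
matching marks are moved to a private list while the lock is held, and the
private list is then drained by always taking its head, so no "next" pointer
is cached across a point where the lock is dropped.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins, NOT the kernel's types or helpers. */
struct node { struct node *prev, *next; };

#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

static void node_init(struct node *n) { n->prev = n->next = n; }
static int list_is_empty(const struct node *h) { return h->next == h; }

/* Unlink n and leave it self-linked, so a second unlink is harmless. */
static void node_del_init(struct node *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
	node_init(n);
}

static void node_add_tail(struct node *n, struct node *h)
{
	n->prev = h->prev;
	n->next = h;
	h->prev->next = n;
	h->prev = n;
}

struct mark {
	struct node g_list;      /* linkage on the group's list */
	struct node free_g_list; /* linkage on the private list */
	unsigned int flags;
	int destroyed;
};

struct group {
	int locked;              /* models the lock as a flag (assumption) */
	struct node marks;
};

static void group_lock(struct group *g)   { assert(!g->locked); g->locked = 1; }
static void group_unlock(struct group *g) { assert(g->locked);  g->locked = 0; }

/* Per-mark teardown: takes the lock itself, so callers must not hold it. */
static void destroy_mark(struct group *g, struct mark *m)
{
	group_lock(g);
	node_del_init(&m->g_list);
	group_unlock(g);
	m->destroyed = 1;        /* a callback would run here, lock not held */
}

static int clear_marks_by_flags(struct group *g, unsigned int flags)
{
	struct node to_free, *pos, *next;
	int n = 0;

	node_init(&to_free);

	/* Phase 1: the lock is held throughout, so caching next is safe. */
	group_lock(g);
	for (pos = g->marks.next; pos != &g->marks; pos = next) {
		struct mark *m = container_of(pos, struct mark, g_list);

		next = pos->next;
		if (m->flags & flags) {
			node_del_init(&m->g_list);
			node_add_tail(&m->free_g_list, &to_free);
		}
	}
	group_unlock(g);

	/* Phase 2: always take the current head of the private list. */
	while (!list_is_empty(&to_free)) {
		struct mark *m = container_of(to_free.next, struct mark,
					      free_g_list);

		node_del_init(&m->free_g_list);
		destroy_mark(g, m);
		n++;
	}
	return n;
}

/* Three marks, two of which match flag 0x1; returns how many were destroyed. */
int demo_clear(void)
{
	struct group g = { 0 };
	struct mark m[3] = { { .flags = 1 }, { .flags = 2 }, { .flags = 1 } };
	int i, n;

	node_init(&g.marks);
	for (i = 0; i < 3; i++) {
		node_init(&m[i].free_g_list);
		node_add_tail(&m[i].g_list, &g.marks);
	}
	n = clear_marks_by_flags(&g, 1);
	assert(m[0].destroyed && !m[1].destroyed && m[2].destroyed);
	assert(g.marks.next == &m[1].g_list && m[1].g_list.next == &g.marks);
	return n;
}
```

In a real implementation each mark on the private list would need a reference
held (as fsnotify_get_mark() does in the old code quoted above), otherwise a
concurrent unlink could still free it while it sits on that list.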