Re: Re: Re: [PATCH] fsnotify: fix a crash due to invalid virtual address

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



  Hello,

On Wed 24-06-15 10:13:50, Ashish Sangwan wrote:
> > On Tue 23-06-15 12:05:51, Ashish Sangwan wrote:
> > > > Looking into this in more detail, it might be worthwhile to revisit how
> > > > mark_mutex is used since at least fanotify and dnotify use it for more than
> > > > just a protection of list of group marks and untangling this would simplify
> > > > things. But that's a longer term goal.
> > > > 
> > > > A relatively simple fix for your issue is to split group list of marks into
> > > > a list of inode marks and a list of mount marks. Then destroying becomes
> > > > much simpler because we always discard the whole list (or both of them) and
> > > > we can easily avoid problems with list corruption when dropping the
> > > > mark_mutex. I can write the patch later or you can do that if you are
> > > Sorry I could not understand why the group's list of marks needs to be split.
> > > I was browsing through the old code, from the days mark_mutex was not present
> > > and it looked like below:
> > > void fsnotify_clear_marks_by_group_flags(struct fsnotify_group *group,
> > > 					 unsigned int flags)
> > > {
> > > 	struct fsnotify_mark *lmark, *mark;
> > > 	LIST_HEAD(free_list);
> > > 
> > > 	spin_lock(&group->mark_lock);
> > > 	list_for_each_entry_safe(mark, lmark, &group->marks_list, g_list) {
> > > 		if (mark->flags & flags) {
> > > 			list_add(&mark->free_g_list, &free_list);
> > > 			list_del_init(&mark->g_list);
> > > 			fsnotify_get_mark(mark);
> > > 		}
> > > 	}
> > > 	spin_unlock(&group->mark_lock);
> > > 
> > > 	list_for_each_entry_safe(mark, lmark, &free_list, free_g_list) {
> > > 		fsnotify_destroy_mark(mark);
> > > 		fsnotify_put_mark(mark);
> > > 	}
> > > }
> > > How about using a temporary onstack list_head like above?
> > 
> > So we can use a temporary list_head for entries selected for destruction as
> > well. *But* in either case you have to be careful because even the
> > temporary free_list can be modified by concurrent mark destruction e.g. from
> > fsnotify_free_marks_by_inode() call as each mark is in two lists - one
> > hanging from inode / mount and one hanging from the notification group.
> True.
> > 
> > Actually, looking into it even fsnotify_destroy_marks() doesn't seem to be
> > properly protected from the race. Sigh.
> True.
> 
> So anything which is calling fsnotify_destroy_mark is not protected against
> concurrent deletion from fsnotify_clear_marks_by_group_flags() and vice versa.
> Plus, as you rightly pointed, we have both the inode mark and vfsmount sharing the same list.
> So even if fsnotify_clear_marks_by_group_flags is for removing inode mark, a
> parallel fsnotify_destroy_mark for vfsmount can cause crash as they are sharing
> same list.

So I was looking more into the issue today and fsnotify_destroy_marks() is
fine in the end since it holds a reference to each mark (so it cannot be
freed) and it also uses a special free_list list_head within mark for its
temporary list so nobody removes any mark from that list.

> diff --git a/fs/notify/mark.c b/fs/notify/mark.c
> index d90deaa..d83ec7d 100644
> --- a/fs/notify/mark.c
> +++ b/fs/notify/mark.c
> @@ -124,14 +124,6 @@ void fsnotify_destroy_mark_locked(struct fsnotify_mark *mark,
> 
>         spin_lock(&mark->lock);
> 
> -       /* something else already called this function on this mark */
> -       if (!(mark->flags & FSNOTIFY_MARK_FLAG_ALIVE)) {
> -               spin_unlock(&mark->lock);
> -               return;
> -       }
> -
> -       mark->flags &= ~FSNOTIFY_MARK_FLAG_ALIVE;
> -
>         if (mark->flags & FSNOTIFY_MARK_FLAG_INODE) {
>                 inode = mark->i.inode;
>                 fsnotify_destroy_inode_mark(mark);
> @@ -188,7 +180,10 @@ void fsnotify_destroy_mark(struct fsnotify_mark *mark,
>                            struct fsnotify_group *group)
>  {
>         mutex_lock_nested(&group->mark_mutex, SINGLE_DEPTH_NESTING);
> -       fsnotify_destroy_mark_locked(mark, group);
> +       if (mark->flags & FSNOTIFY_MARK_FLAG_ALIVE) {
> +               mark->flags &= ~FSNOTIFY_MARK_FLAG_ALIVE;
> +               fsnotify_destroy_mark_locked(mark, group);
> +       }
>         mutex_unlock(&group->mark_mutex);
>  }
> 
> @@ -293,14 +288,27 @@ void fsnotify_clear_marks_by_group_flags(struct fsnotify_group *group,
>                                          unsigned int flags)
>  {
>         struct fsnotify_mark *lmark, *mark;
> -
> +       LIST_HEAD(free_list);
> +
>         mutex_lock_nested(&group->mark_mutex, SINGLE_DEPTH_NESTING);
> -       list_for_each_entry_safe(mark, lmark, &group->marks_list, g_list) {
> +       /* Pass 1 : clear the alive flag and move to free list */
> +       list_for_each_entry_safe(mark, lmark, &group->marks_list, g_list)
>                 if (mark->flags & flags) {
> +                       /*
> +                        * If the mark is present on group's mark list
> +                        * it has to be alive.
> +                        */
> +                       WARN_ON(!(mark->flags & FSNOTIFY_MARK_FLAG_ALIVE));
> +                       list_del_init(&mark->g_list);
> +                       list_add(&mark->g_list, &free_list);
> +                       mark->flags &= ~FSNOTIFY_MARK_FLAG_ALIVE;
> +               }
> +
> +       /* Pass 2: remove mark */
> +       list_for_each_entry_safe(mark, lmark, &free_list, g_list) {
>                         fsnotify_get_mark(mark);
>                         fsnotify_destroy_mark_locked(mark, group);
>                         fsnotify_put_mark(mark);
> -               }
>         }
>         mutex_unlock(&group->mark_mutex);
>  }

So this should fix the problem as well. However the lifetime rules are so
convoluted that I'm thinking how to make things simpler (especially to
unify how clearing via inode/mount and via group works). When investigating
the code I also found out there are possible issues with mark->inode
getting cleared while we are delivering some event. So I'm looking into
fixing all the problems.

									Honza
-- 
Jan Kara <jack@xxxxxxx>
SUSE Labs, CR
--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html



[Index of Archives]     [Linux Ext4 Filesystem]     [Union Filesystem]     [Filesystem Testing]     [Ceph Users]     [Ecryptfs]     [AutoFS]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux Cachefs]     [Reiser Filesystem]     [Linux RAID]     [Samba]     [Device Mapper]     [CEPH Development]
  Powered by Linux