Re: [PATCH 4/4] fanotify: Use interruptible wait when waiting for permission events

Jan Kara <jack@xxxxxxx> · Wed, 13 Feb 2019 15:56:49 +0100

On Tue 12-02-19 08:40:13, Orion Poplawski wrote:
> On 1/9/19 2:23 AM, Jan Kara wrote:
> > On Wed 09-01-19 09:51:08, Amir Goldstein wrote:
> >> On Tue, Jan 8, 2019 at 6:46 PM Jan Kara <jack@xxxxxxx> wrote:
> >>>
> >>> When waiting for response to fanotify permission events, we currently
> >>> use uninterruptible waits. That makes code simple however it can cause
> >>> lots of processes to end up in uninterruptible sleep with hard reboot
> >>> being the only alternative in case fanotify listener process stops
> >>> responding (e.g. due to a bug in its implementation). Uninterruptible
> >>> sleep also makes system hibernation fail if the listener gets frozen
> >>> before the process generating fanotify permission event.
> >>>
> >>> Fix these problems by using interruptible sleep for waiting for response
> >>> to fanotify event. This is slightly tricky though - we have to
> >>> detect when the event got already reported to userspace as in that
> >>> case we must not free the event.
> >>
> >>> Instead we push the responsibility for
> >>> freeing the event to the process that will write response to the
> >>> event.
> >>
> >> It a bit hard to follow this patch. Can we make the patch that
> >> shifts responsibility to free the event a separate patch?
> >> fsnotify_remove_queued_event() helper can tag along with it
> >> or separate patch as you wish.
> > 
> > I'll have a look how to best split this. It should be possible.
> > 
> >>> Signed-off-by: Jan Kara <jack@xxxxxxx>
> >>> ---
> >>
> >> I think it would be good to add reported-by tand the links you provided
> >> in cover letter in this patch as well.
> > 
> > Good point. Will do.
> > 
> >>>
> >>> @@ -87,8 +116,8 @@ static int fanotify_get_response(struct fsnotify_group *group,
> >>>         if (response & FAN_AUDIT)
> >>>                 audit_fanotify(response & ~FAN_AUDIT);
> >>>
> >>> -       pr_debug("%s: group=%p event=%p about to return ret=%d\n", __func__,
> >>> -                group, event, ret);
> >>
> >> Looks useful. Why remove?
> > 
> > OK, I'll leave it alone.
> > 
> >>> +static void set_event_response(struct fsnotify_group *group,
> >>> +                              struct fanotify_perm_event_info *event,
> >>> +                              unsigned int response)
> >>> +{
> >>> +       spin_lock(&group->notification_lock);
> >>> +       /* Waiter got aborted by a signal? Free the event. */
> >>> +       if (unlikely(event->response == FAN_EVENT_CANCELED)) {
> >>> +               spin_unlock(&group->notification_lock);
> >>> +               fsnotify_destroy_event(group, &event->fae.fse);
> >>> +               return;
> >>> +       }
> >>> +       event->response = response | FAN_EVENT_ANSWERED;
> >>> +       spin_unlock(&group->notification_lock);
> >>> +}
> >>> +
> >>
> >> I don't understand. AFAICS, all callers of set_event_response()
> >> call immediately after spin_unlock(&group->notification_lock),
> >> without any user wait involved.
> >> I think it makes more sense for set_event_response() to assert the
> >> lock than it is to take it.
> >>
> >> Am I missing anything?
> > 
> > In case we need to destroy the event, we want to drop the
> > notification_lock. So to avoid a situation where set_event_response() drops
> > a lock it did not acquire (which is not very intuitive), I've decided for
> > the less efficient scheme of dropping and retaking the lock.
> > 
> > But maybe with better function name and some asserts, we could live with
> > dropping the lock inside the function without taking it.
> > 
> > 								Honza
> > 
> 
> 
> Any more progress here?  Thanks for your work on this, it's a real thorn in
> our side here.

I've sent v2 of the patches today to push things further.

								Honza

-- 
Jan Kara <jack@xxxxxxxx>
SUSE Labs, CR