On Tue 12-02-19 08:40:13, Orion Poplawski wrote: > On 1/9/19 2:23 AM, Jan Kara wrote: > > On Wed 09-01-19 09:51:08, Amir Goldstein wrote: > >> On Tue, Jan 8, 2019 at 6:46 PM Jan Kara <jack@xxxxxxx> wrote: > >>> > >>> When waiting for response to fanotify permission events, we currently > >>> use uninterruptible waits. That makes code simple however it can cause > >>> lots of processes to end up in uninterruptible sleep with hard reboot > >>> being the only alternative in case fanotify listener process stops > >>> responding (e.g. due to a bug in its implementation). Uninterruptible > >>> sleep also makes system hibernation fail if the listener gets frozen > >>> before the process generating fanotify permission event. > >>> > >>> Fix these problems by using interruptible sleep for waiting for response > >>> to fanotify event. This is slightly tricky though - we have to > >>> detect when the event got already reported to userspace as in that > >>> case we must not free the event. > >> > >>> Instead we push the responsibility for > >>> freeing the event to the process that will write response to the > >>> event. > >> > >> It a bit hard to follow this patch. Can we make the patch that > >> shifts responsibility to free the event a separate patch? > >> fsnotify_remove_queued_event() helper can tag along with it > >> or separate patch as you wish. > > > > I'll have a look how to best split this. It should be possible. > > > >>> Signed-off-by: Jan Kara <jack@xxxxxxx> > >>> --- > >> > >> I think it would be good to add reported-by tand the links you provided > >> in cover letter in this patch as well. > > > > Good point. Will do. > > > >>> > >>> @@ -87,8 +116,8 @@ static int fanotify_get_response(struct fsnotify_group *group, > >>> if (response & FAN_AUDIT) > >>> audit_fanotify(response & ~FAN_AUDIT); > >>> > >>> - pr_debug("%s: group=%p event=%p about to return ret=%d\n", __func__, > >>> - group, event, ret); > >> > >> Looks useful. Why remove? > > > > OK, I'll leave it alone. > > > >>> +static void set_event_response(struct fsnotify_group *group, > >>> + struct fanotify_perm_event_info *event, > >>> + unsigned int response) > >>> +{ > >>> + spin_lock(&group->notification_lock); > >>> + /* Waiter got aborted by a signal? Free the event. */ > >>> + if (unlikely(event->response == FAN_EVENT_CANCELED)) { > >>> + spin_unlock(&group->notification_lock); > >>> + fsnotify_destroy_event(group, &event->fae.fse); > >>> + return; > >>> + } > >>> + event->response = response | FAN_EVENT_ANSWERED; > >>> + spin_unlock(&group->notification_lock); > >>> +} > >>> + > >> > >> I don't understand. AFAICS, all callers of set_event_response() > >> call immediately after spin_unlock(&group->notification_lock), > >> without any user wait involved. > >> I think it makes more sense for set_event_response() to assert the > >> lock than it is to take it. > >> > >> Am I missing anything? > > > > In case we need to destroy the event, we want to drop the > > notification_lock. So to avoid a situation where set_event_response() drops > > a lock it did not acquire (which is not very intuitive), I've decided for > > the less efficient scheme of dropping and retaking the lock. > > > > But maybe with better function name and some asserts, we could live with > > dropping the lock inside the function without taking it. > > > > Honza > > > > > Any more progress here? Thanks for your work on this, it's a real thorn in > our side here. I've sent v2 of the patches today to push things further. Honza -- Jan Kara <jack@xxxxxxxx> SUSE Labs, CR