On 05/22, Deepa Dinamani wrote:
>
> > > > --- a/include/linux/sched/signal.h
> > > > +++ b/include/linux/sched/signal.h
> > > > @@ -416,7 +416,6 @@ void task_join_group_stop(struct task_struct *task);
> > > >  static inline void set_restore_sigmask(void)
> > > >  {
> > > >         set_thread_flag(TIF_RESTORE_SIGMASK);
> > > > -       WARN_ON(!test_thread_flag(TIF_SIGPENDING));
> > >
> > > So you always want do_signal() to be called?
> >
> > Why do you think so? No. This is just to avoid the warning, because with the
> > patch I sent set_restore_sigmask() is called "in advance".
> >
> > > You will have to check each architecture's implementation of
> > > do_signal() to check if that has any side effects.
> >
> > I don't think so.
>
> Why not?

Why yes? It seems that we have some communication problems.

OK, please look at the code I proposed, I only added a couple of TODO comments

	static inline void set_restore_sigmask(void)
	{
		// WARN_ON(!TIF_SIGPENDING) was removed by this patch
		current->restore_sigmask = true;
	}

	int set_user_sigmask(const sigset_t __user *umask, size_t sigsetsize)
	{
		sigset_t kmask;

		if (!umask)
			return 0;

		if (sigsetsize != sizeof(sigset_t))
			return -EINVAL;
		if (copy_from_user(&kmask, umask, sizeof(sigset_t)))
			return -EFAULT;

		// Save the old mask "in advance", then install the user's mask
		set_restore_sigmask();
		current->saved_sigmask = current->blocked;
		set_current_blocked(&kmask);

		return 0;
	}

	SYSCALL_DEFINE6(epoll_pwait, int, epfd, struct epoll_event __user *, events,
			int, maxevents, int, timeout, const sigset_t __user *, sigmask,
			size_t, sigsetsize)
	{
		int error;

		/*
		 * If the caller wants a certain signal mask to be set during the wait,
		 * we apply it here.
		 */
		error = set_user_sigmask(sigmask, sigsetsize);
		if (error)
			return error;

		error = do_epoll_wait(epfd, events, maxevents, timeout);

		// TODO. Add another helper to restore WARN_ON(!TIF_SIGPENDING)
		// in case restore_saved_sigmask() is NOT called.

		if (error != -EINTR)
			restore_saved_sigmask();

		return error;
	}

Note that it looks much simpler.
Now, could you please explain

	- why do you think this code is not correct?

	- why do you think we need to audit do_signal()???

> > > Although this is not what the patch is solving.
> >
> > Sure. But you know, after I tried to read the changelog, I am not sure
> > I understand what exactly you are trying to fix. Could you please explain
> > this part
> >
> >       The behavior
> >       before 854a6ed56839a was that the signals were dropped after the error
> >       code was decided. This resulted in lost signals but the userspace did not
> >       notice it
> >
> > ? I fail to understand it, sorry. It looks as if the code was already buggy before
> > that commit and it could miss a signal or something like this, but I do not see how.
>
> Did you read the explanation pointed to in the commit text? :
>
> https://lore.kernel.org/linux-fsdevel/20190427093319.sgicqik2oqkez3wk@dcvr/

This link points to a lengthy and confusing discussion... After a quick glance
I didn't find an answer to my question, so let me repeat it: why do you think
the kernel was buggy even before 854a6ed56839a40f6b5d02a2962f48841482eec4
("signal: Add restore_user_sigmask()") ?

Just in case...
https://lore.kernel.org/linux-fsdevel/CABeXuvq7gCV2qPOo+Q8jvNyRaTvhkRLRbnL_oJ-AuK7Sp=P3QQ@xxxxxxxxxxxxxx/
doesn't look right to me... let me quote some parts of your email:

	-       /*
	-        * If we changed the signal mask, we need to restore the original one.
	-        * In case we've got a signal while waiting, we do not restore the
	-        * signal mask yet, and we allow do_signal() to deliver the signal on
	-        * the way back to userspace, before the signal mask is restored.
	-        */
	-       if (sigmask) {
	-               if (error == -EINTR) {
	-                       memcpy(&current->saved_sigmask, &sigsaved,
	-                              sizeof(sigsaved));
	-                       set_restore_sigmask();
	-               } else

	**** Execution reaches this else statement and the sigmask is restored
	directly, ignoring the newly generated signal.

I see nothing wrong. This is what we want.

	The signal is never handled.

Well, "never" is not right.
It won't be handled now, because it is blocked, but for example think of
another pselect/whatever call with the same sigmask.

> It would be better to understand the issue before we start discussing the fix.

Agreed. And that is why I am asking for your explanations. Quite possibly I missed
something, but so far I fail to understand you.

Oleg.
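
To illustrate the "never is not right" point from userspace: a minimal sketch,
assuming a Linux/POSIX system. It is not part of the proposed patch, and the
names blocked_signal_demo() and on_usr1() are made up for this example. A signal
raised while it is blocked stays pending, and a later pselect() whose temporary
mask unblocks it lets the handler run and fails with EINTR.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>	/* only for the assert() in the usage note */
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/select.h>
#include <time.h>

static volatile sig_atomic_t handled;

/* Hypothetical handler: just records that SIGUSR1 was delivered. */
static void on_usr1(int sig)
{
	(void)sig;
	handled = 1;
}

/*
 * Hypothetical demo helper. Returns 0 if the blocked signal stayed pending
 * and was delivered by a later pselect(); a nonzero value otherwise.
 */
int blocked_signal_demo(void)
{
	struct sigaction sa;
	sigset_t block, orig, pending, waitmask;
	struct timespec timeout = { 1, 0 };	/* guard against hanging */
	int ret, err, result = 0;

	handled = 0;
	memset(&sa, 0, sizeof(sa));
	sa.sa_handler = on_usr1;
	sigemptyset(&sa.sa_mask);
	if (sigaction(SIGUSR1, &sa, NULL))
		return -1;

	/* Block SIGUSR1, remembering the original mask. */
	sigemptyset(&block);
	sigaddset(&block, SIGUSR1);
	if (sigprocmask(SIG_BLOCK, &block, &orig))
		return -1;

	/* The signal arrives while blocked: not handled now, but not lost. */
	raise(SIGUSR1);
	if (handled) {
		result = 1;
		goto out;
	}
	if (sigpending(&pending) || !sigismember(&pending, SIGUSR1)) {
		result = 2;
		goto out;
	}

	/* A later wait with a mask that unblocks SIGUSR1 delivers it. */
	waitmask = orig;
	sigdelset(&waitmask, SIGUSR1);
	ret = pselect(0, NULL, NULL, NULL, &timeout, &waitmask);
	err = errno;
	if (ret != -1 || err != EINTR || !handled)
		result = 3;
out:
	/* Restore the caller's original mask. */
	sigprocmask(SIG_SETMASK, &orig, NULL);
	return result;
}
```

Calling it from main(), for example as assert(blocked_signal_demo() == 0), checks
the whole sequence; the one-second timeout is only there so the sketch cannot
block forever if the signal were somehow missing.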