On Thu, May 23, 2019 at 11:06 AM Deepa Dinamani <deepa.kernel@xxxxxxxxx> wrote: > > Ok, since there has been quite a bit of argument here, I will > backtrack a little bit and maybe it will help us understand what's > happening here. > There are many scenarios being discussed on this thread: > a. State of code before 854a6ed56839a > b. State after 854a6ed56839a > c. Proposed fix as per the patchset in question. > > Oleg, I will discuss these first and then we can discuss the > additional changes you suggested. > > Some background on why we have these syscalls that take sigmask as an > argument. This is just for the sake of completeness of the argument. > > These are particularly meant for a scenario(d) such as below: > > 1. block the signals you don't care about. > 2. syscall() > 3. unblock the signals blocked in 1. > > The problem here is that if there is a signal that is not blocked by 1 > and such a signal is delivered between 1 and 2(since they are not > atomic), the syscall in 2 might block forever as it never found out > about the signal. > > As per [a] and let's consider the case of epoll_pwait only first for simplicity. > > As I said before, ep_poll() is what checks for signal_pending() and is > responsible for setting errno to -EINTR when there is a signal. > > So if a signal is received after ep_poll() and ep_poll() returns > success, it is never noticed by the syscall during execution. > So the question is does the userspace have to know about this signal > or not. From scenario [d] above, I would say it should, even if all > the fd's completed successfully. > This does not happen in [a]. So this is what I said was already broken. > > What [b] does is to move the signal check closer to the restoration of > the signal. This way it is good. So, if there is a signal after > ep_poll() returns success, it is noticed and the signal is delivered > when the syscall exits. But, the syscall error status itself is 0. > > So now [c] is adjusting the return values based on whether extra > signals were detected after ep_poll(). This part was needed even for > [a]. > > Let me know if this clarifies things a bit. Just adding a little more clarification, there is an additional change between [a] and [b]. As per [a] we would just restore the signal instead of changing the saved_sigmask and the signal could get delivered right then. [b] changes this to happen at syscall exit: void restore_user_sigmask(const void __user *usigmask, sigset_t *sigsaved) { <snip> /* * When signals are pending, do not restore them here. * Restoring sigmask here can lead to delivering signals that the above * syscalls are intended to block because of the sigmask passed in. */ if (signal_pending(current)) { current->saved_sigmask = *sigsaved; set_restore_sigmask(); return; } -Deepa