Re: [parisc] double restarts on multiple signal arrivals

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, May 18, 2012 at 04:15:42PM -0400, Carlos O'Donell wrote:

> I've hit "Mark as unread" twice on this thread now because I *am*
> interested in this conversation.
> 
> However, I must confess that the original email reads like a rant
> without a clear explanation of exactly what you think is wrong.
> 
> I'm familiar with the PARISC signal code because I worked on the
> 32/64-bit compat support for the frame layout, I also helped Randolph
> work on the syscall restart logic.
> 
> What I don't understand is what's wrong with it now, and your subject
> begins to hint at the problem.
> 
> Can you describe in more detail exactly what problem we need to solve?

Sure.  Have e.g. read(2) on a pipe called when you happen to have -512 in
r28 and have two signals arrive at once while inside that read().  Both
with user handlers.  Both with SA_RESTART.

When we enter the kernel, we have pt_regs created on kernel stack and
filled with the values we had in registers.  Among other things, the
value of r28 is stored in regs->orig_r28.  Note that arguments and the
syscall number are all passed through other registers, so setting the
arguments for syscall did not overwrite the value we used to have in
r28.  It's still -512.

We find which C function to call (sys_read) and go there, with return
address set to syscall_exit.

sys_read() ends up calling pipe_read(), which finds pipe empty and goes
to sleep.  The signals arrive and pipe_read() is woken up.  It sees that
	* signal_pending() is true
	* nothing has been read yet
so it return -ERESTARSYS.  sys_read() returns what it got from pipe_read().

Now we are at syscall_exit.  We set r1 to pt_regs created on stack when
we entered the syscall.  Return value of sys_read() is taken from r28 and
stored into regs->gr[28].  We check TIF_NEED_RESCHED and it's not set.
We'd reached syscall_check_sig and check TIF_SIGPENDING, which is set.
We go through syscall_do_signal and reach the call of
do_notify_resume(regs, 1), which calls do_signal(regs, 1).

There we call get_signal_to_deliver() and get one of those two signals
dequeued and returned.  Note that TIF_SIGPENDING is still set - there is
the second signal left.  We got positive signr, so we do not leave the
loop yet.  in_syscall is 1, so we call syscall_restart().

We check regs->gr[28], which is -ERESTARTSYS.  SA_RESTART is set in
sa_flags, so we fall through to the next case, which is
                regs->gr[31] -= 8; /* delayed branching */
                /* Preserve original r28. */
                regs->gr[28] = regs->orig_r28;
Note that ERESTARTSYS is 512, so the second assignment does nothing - we
still have regs->gr[28] equal to -ERESTARTSYS.  We are done with
syscall_restart().

Now we call handle_signal(), which calls setup_rt_frame().  sigframe is
allocated on userland stack and filled accordingly to the contents of *regs.
No errors occured, so we set regs->gr[31] to entry point of signal handler,
regs->gr[2] to address of sigreturn trampoline we'd created in the sigframe,
regs->gr[26..24] to arguments of handler, and regs->gr[30] to new value of
userland stack pointer.

Back in handle_signal(), we adjust the set of blocked signals and return 1.
do_signal() sees that RESTORE_SIGMASK is not set and it's done - we return
to do_notify_resume() and from there back into entry.S.

We reload registers from pt_regs and go to syscall_check_sig again.  There
we reread the thread flags and see that SIGPENDING is still set - we have
another signal to deal with.  We hit do_notify_resume() and do_signal()
again.  And again, in_syscall is 1.

do_signal() gets the remaining pending signal from get_signal_to_deliver().
Again, we call syscall_restart().  And neither regs->gr[28] nor regs->orig_r28
has been changed, so we hit the same case.  Assignment to regs->gr[28] is
a no-op again.  But look what happened to regs->gr[31] - it got decremented
by 8.  So it points 8 bytes before the entry point of the first handler now.

We proceed to set the second sigframe up.  The registers are saved in
sigcontext, adjusted so that they correspond to the call of the second
handler with future return to the second trampoline and we return all the
way out to entry.S.  There we go to syscall_check_sig again, but this
time we have no pending signals left, so we reach syscall_restore and
proceed to userland.

Now we are in the second handler.  It's executed to the end and when it
returns, we go to the address we had in r2.  I.e. the second trampoline.
There we get r25 set to 1, r20 set to __NR_rt_sigreturn and we go to
execute the syscall.  Which restores the registers from sigcontext and,
since its in_syscall argument is 1, sets ->gr[31] to sc->sc_iaoq.  Which
had been set by setup_rt_sigcontext() from regs->gr[31] at the moment it
had been called, i.e. the entry point of the first handler decremented by 8.  
Then we proceed to return to userland.  And arrive there are the right
arguments for the first handler, but instruction pointer is 8 bytes too
early.  Bad things happen, obviously.

The point where the things go wrong is the second call of syscall_restart().
It should've done nothing.  We need to do syscall restarts only when building
the first sigframe.  And usually it does work out that way, simply because
the value you happened to have in r28 is probably not one of the 4 that
make syscall_restart() do things.  So after the first time around it does
nothing.  Very similar logics on x86 does, indeed, work in all cases.
There the register used for return values happens to be the same as we use
to pass syscall number, so you can't arrange a call of syscall_restart()
analog that would trigger restart *and* leave the return value register
unchanged - you'd need %eax to be -ERESTART<something> when you enter the
syscall and that would immediately result in regs->ax set to -ENOSYS.  I.e.
no restarts would be done.  However, on parisc that doesn't work - you can
start with r28 being -ERESTARTSYS and get -ERESTARTSYS from syscall.

And no, I don't have any good suggestions on how to fix it right now.  Depends
on how does gdb on parisc play with pt_regs, among other things...

Sorry if the original posting sounded like a rant - I'd spent the last month
crawling through the assembler glue around signal hanlding all over arch/*
and that probably affected the tone.  Definitely had been affecting the amount
of spoken obscenities lately...
--
To unsubscribe from this list: send the line "unsubscribe linux-arch" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[Index of Archives]     [Linux Kernel]     [Kernel Newbies]     [x86 Platform Driver]     [Netdev]     [Linux Wireless]     [Netfilter]     [Bugtraq]     [Linux Filesystems]     [Yosemite Discussion]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Samba]     [Device Mapper]

  Powered by Linux