Re: ptrace() behaviour when doing PTRACE_KILL

Pablo Antonio <pabloa@xxxxxxxxx> · Tue, 30 Aug 2011 18:02:15 -0300

Hi Oleg,

On Tue, Aug 30, 2011 at 10:30 AM, Oleg Nesterov <oleg@xxxxxxxxxx> wrote:
> Pablo Antonio wrote:
>>
>> Hello,
>>    I was writing some user code that uses the ptrace() system call
>> and found it wasn't behaving as I expected. Maybe this is simply
>> because I don't know enough about ptrace()'s behaviour, so I'd like
>> someone to enlighten me with some information.
>
> First of all, please never use PTRACE_KILL. It is more or less
> equivalent to ptrace(PTRACE_CONT, pid, 0, SIGKILL) but doesn't return
> the error.
>
>> So, here's the code: http://codepad.org/TJ7Lua4Q Basically,
>
> I assume you run it on a 32-bit machine... otherwise you need
> "8 * ORIG_RAX" and "8 * RAX".

Yes, x86-32.
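
For reference, here is a minimal sketch of how the offset could be chosen
at compile time, so the same tracer builds on both x86-32 and x86-64. The
names SYSCALL_NR_REG, SYSCALL_RET_REG, user_offset() and peek_syscall_nr()
are made up for illustration and are not from the code on codepad; the
register indices come from glibc's <sys/reg.h>, which I am assuming is
available (x86 only).

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/ptrace.h>
#include <sys/reg.h>    /* ORIG_EAX/EAX or ORIG_RAX/RAX; x86 only */
#include <sys/types.h>

/* Hypothetical names: pick the register indices for the build target. */
#if defined(__x86_64__)
enum { SYSCALL_NR_REG = ORIG_RAX, SYSCALL_RET_REG = RAX };
#elif defined(__i386__)
enum { SYSCALL_NR_REG = ORIG_EAX, SYSCALL_RET_REG = EAX };
#else
#error "this sketch only covers x86"
#endif

/* Byte offset into the tracee's USER area: one long per register slot,
 * i.e. 4 * index on x86-32 and 8 * index on x86-64. */
long user_offset(int reg_index)
{
    assert(reg_index >= 0);
    return (long)sizeof(long) * reg_index;
}

/* Read the syscall number of a tracee that is stopped at a syscall stop.
 * Returns 0 and stores the number in *nr, or -1 if ptrace() failed.
 * PTRACE_PEEKUSER returns the word itself, so errno is the only way to
 * tell a real error from a value of -1. */
int peek_syscall_nr(pid_t pid, long *nr)
{
    long value;

    errno = 0;
    value = ptrace(PTRACE_PEEKUSER, pid,
                   (void *)user_offset(SYSCALL_NR_REG), NULL);
    if (value == -1 && errno != 0)
        return -1;
    *nr = value;
    return 0;
}
```

The return value of the syscall would be read the same way, with
user_offset(SYSCALL_RET_REG) at the syscall-exit stop.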

>
>> I was
>> expecting the parent process to kill the child when the latter
>> attempted to use the kill() system call, not letting it send signals
>> to other processes. I had done this before with pretty much the same
>> code, but instead of killing when kill() was called, I was doing it
>> when a fork() or clone() was attempted. And it worked.
>
> Because do_fork() checks signal_pending() and aborts if the process
> is killed (actually it aborts if any signal is pending).
>
>> But now that I'm doing the same thing with kill(), I'm seeing that the
>> child process ends up killed, as expected, but the signal is sent
>> nonetheless. I understand that when I do the PTRACE_KILL, a SIGKILL
>> signal is sent to the child. And PTRACE_KILL is used on syscall
>> entrance in that code. If the SIGKILL signal is delivered to the
>> process before it continues executing the system call in kernelspace,
>
> In this particular case SIGKILL was already delivered, yes.
>
>> then I don't understand why this code doesn't work as expected.
>
> Because you didn't abort this syscall. The tracee calls sys_kill()
> which doesn't check the signals; the pending SIGKILL will be noticed
> later, when the tracee attempts to return to user mode.
>
> This is correct. Most non-blocking syscalls do not check for
> signals; doing so would be pointless.

Thanks for your answer, Oleg. What you are saying makes sense to me,
but I must say that I talked with some people (one of them was a GDB
developer, the other was Denys Vlasenko) and they tested the code
you saw with different kernels and the signal was inhibited. Both of
them, I think, were using redhat/fedora kernels.

Maybe there's a race condition happening, maybe redhat/fedora kernels
have different behaviour, I don't know. But I guess that if there's a
difference in behaviour between vanilla and fedora/redhat kernels,
that should be fixed.
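
To make Oleg's point about aborting the syscall concrete, here is a rough
sketch of what I understand a tracer could do so that the result does not
depend on when the kernel notices the SIGKILL: at the syscall-entry stop
for kill(), overwrite the syscall number with -1 so that no syscall is run,
and only then kill the tracee. This is not the code from codepad. The names
run_demo(), kill_and_reap() and NR_OFFSET are hypothetical, and it assumes
x86 with glibc's <sys/reg.h>. It also uses plain kill(pid, SIGKILL) instead
of PTRACE_KILL.

```c
#include <signal.h>
#include <stddef.h>
#include <sys/ptrace.h>
#include <sys/reg.h>        /* ORIG_EAX / ORIG_RAX; x86 only */
#include <sys/syscall.h>    /* SYS_kill */
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#if defined(__x86_64__)
#define NR_OFFSET ((void *)(sizeof(long) * ORIG_RAX))
#elif defined(__i386__)
#define NR_OFFSET ((void *)(sizeof(long) * ORIG_EAX))
#else
#error "this sketch only covers x86"
#endif

static volatile sig_atomic_t got_usr1;

static void on_usr1(int sig) { (void)sig; got_usr1 = 1; }

/* Hypothetical helper: SIGKILL the tracee and wait until it is gone. */
static void kill_and_reap(pid_t child)
{
    int status;

    kill(child, SIGKILL);
    do {
        if (waitpid(child, &status, 0) < 0)
            break;
    } while (!WIFEXITED(status) && !WIFSIGNALED(status));
}

/* Returns 1 if the child's SIGUSR1 reached us, 0 if it did not,
 * and -1 if the tracing setup failed. */
int run_demo(void)
{
    int status, sig = 0;
    pid_t parent = getpid();
    pid_t child;

    got_usr1 = 0;
    signal(SIGUSR1, on_usr1);

    child = fork();
    if (child < 0)
        return -1;
    if (child == 0) {
        if (ptrace(PTRACE_TRACEME, 0, NULL, NULL) < 0)
            _exit(1);
        raise(SIGSTOP);         /* let the parent take control */
        kill(parent, SIGUSR1);  /* the call the tracer wants to block */
        _exit(0);
    }

    /* First stop is the child's own SIGSTOP. */
    if (waitpid(child, &status, 0) < 0 || !WIFSTOPPED(status))
        return -1;
    /* Mark syscall stops with SIGTRAP | 0x80 so they can be told apart
     * from ordinary signal stops. */
    if (ptrace(PTRACE_SETOPTIONS, child, NULL,
               (void *)(long)PTRACE_O_TRACESYSGOOD) < 0) {
        kill_and_reap(child);
        return -1;
    }

    for (;;) {
        if (ptrace(PTRACE_SYSCALL, child, NULL, (void *)(long)sig) < 0) {
            kill_and_reap(child);
            return -1;
        }
        if (waitpid(child, &status, 0) < 0)
            return -1;
        if (WIFEXITED(status) || WIFSIGNALED(status))
            return got_usr1;    /* tracee ended before we caught kill() */
        if (WSTOPSIG(status) != (SIGTRAP | 0x80)) {
            sig = WSTOPSIG(status);     /* signal stop: pass it on */
            continue;
        }
        sig = 0;
        /* The entry stop of kill() is seen before its exit stop, so
         * matching on the number alone is enough for this demo. */
        if (ptrace(PTRACE_PEEKUSER, child, NR_OFFSET, NULL) == SYS_kill) {
            /* Abort: with syscall number -1 the kernel runs nothing. */
            ptrace(PTRACE_POKEUSER, child, NR_OFFSET, (void *)-1L);
            kill_and_reap(child);
            return got_usr1;
        }
    }
}
```

If this works the way I think it does, run_demo() should return 0, meaning
the SIGUSR1 never reached the parent. A real tracer would also want to keep
track of entry versus exit stops instead of matching on the syscall number
alone.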

Thanks for your time,

-- 
Pablo Antonio (AKA crazy2k)
http://www.pablo-a.com.ar/
