Hi Oleg,

On Tue, Aug 30, 2011 at 10:30 AM, Oleg Nesterov <oleg@xxxxxxxxxx> wrote:
> Pablo Antonio wrote:
>>
>> Hello,
>> I was writing some user code that uses the ptrace() system call
>> and found it wasn't behaving as I expected. Maybe this is simply
>> because I don't know enough about ptrace()'s behaviour, so I'd like
>> someone to enlighten me with some information.
>
> First of all, please never use PTRACE_KILL. It is more or less
> equivalent to ptrace(PTRACE_CONT, pid, 0, SIGKILL) but doesn't return
> the error.
>
>> So, here's the code: http://codepad.org/TJ7Lua4Q Basically,
>
> I assume you run it on a 32-bit machine... otherwise you need
> "8 * ORIG_RAX" and "8 * RAX".

Yes, x86-32.

>
>> I was
>> expecting the parent process to kill the child when the latter
>> attempted to use the kill() system call, not letting it send signals
>> to other processes. I had done this before with pretty much the same
>> code, but instead of killing when kill() was called, I was doing it
>> when a fork() or clone() was attempted. And it worked.
>
> Because do_fork() checks signal_pending() and aborts if the process
> is killed (actually it aborts if any signal is pending).
>
>> But now that I'm doing the same thing with kill(), I'm seeing that the
>> child process ends up killed, as expected, but the signal is sent
>> nonetheless. I understand that when I do the PTRACE_KILL, a SIGKILL
>> signal is sent to the child. And PTRACE_KILL is used on syscall
>> entrance in that code. If the SIGKILL signal is delivered to the
>> process before it continues executing the system call in kernelspace,
>
> In this particular case SIGKILL was already delivered, yes.
>
>> then I don't understand why this code doesn't work as expected.
>
> Because you didn't abort this syscall. The tracee calls sys_kill(),
> which doesn't check the signals; the pending SIGKILL will be noticed
> later, when the tracee attempts to return to user mode.
>
> This is correct. Most non-blocking syscalls do not check the
> signals; this is pointless.

Thanks for your answer, Oleg.

What you are saying makes sense to me, but I must say that I talked
with some people (one of them was a GDB developer, the other was Denys
Vlasenko) and they tested the code you saw with different kernels, and
the signal was inhibited. Both of them, I think, were using
Red Hat/Fedora kernels.

Maybe there's a race condition happening, maybe Red Hat/Fedora kernels
behave differently, I don't know. But I guess that if there's a
difference in behaviour between vanilla and Red Hat/Fedora kernels,
that should be fixed.

Thanks for your time,

-- 
Pablo Antonio (AKA crazy2k)
http://www.pablo-a.com.ar/

_______________________________________________
Kernelnewbies mailing list
Kernelnewbies@xxxxxxxxxxxxxxxxx
http://lists.kernelnewbies.org/mailman/listinfo/kernelnewbies
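
For readers following the thread: Oleg's point is that the syscall itself
has to be aborted, and that PTRACE_KILL should be avoided. The thread does
not say how to abort it. One commonly used approach on x86, shown here only
as a minimal sketch and not as the code from the codepad link, is to
overwrite the saved syscall number with -1 at the syscall-entry stop and
then send SIGKILL. The helper names abort_pending_syscall() and
abort_and_kill() and the macro SYSCALL_NR_OFFSET are made up for this
illustration; the sketch assumes x86 Linux with glibc's <sys/reg.h>.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <signal.h>
#include <sys/ptrace.h>
#include <sys/reg.h>    /* ORIG_RAX / ORIG_EAX (x86 only) */
#include <sys/types.h>

/* Byte offset of the saved syscall number in the tracee's USER area:
 * "8 * ORIG_RAX" on x86-64, "4 * ORIG_EAX" on x86-32. */
#if defined(__x86_64__)
#define SYSCALL_NR_OFFSET (sizeof(long) * ORIG_RAX)
#elif defined(__i386__)
#define SYSCALL_NR_OFFSET (sizeof(long) * ORIG_EAX)
#else
#error "this sketch only covers x86"
#endif

/* Hypothetical helper: call while `pid` is stopped at syscall entry.
 * Overwrites the syscall number with -1, which matches no syscall.
 * Returns -1 and sets errno if the ptrace() request fails. */
long abort_pending_syscall(pid_t pid)
{
    return ptrace(PTRACE_POKEUSER, pid,
                  (void *)SYSCALL_NR_OFFSET, (void *)-1L);
}

/* Hypothetical helper: abort the pending syscall, then send SIGKILL
 * with kill() so that a failure is reported (unlike PTRACE_KILL). */
int abort_and_kill(pid_t pid)
{
    if (pid <= 0) {          /* kill(0, ...) or kill(-1, ...) would */
        errno = EINVAL;      /* signal other processes as well      */
        return -1;
    }
    if (abort_pending_syscall(pid) == -1)
        return -1;
    return kill(pid, SIGKILL);
}
```

In a tracer loop driven by PTRACE_SYSCALL, abort_and_kill(child) would be
called at the entry stop where the saved syscall number is that of kill(),
in place of ptrace(PTRACE_KILL, child, 0, 0).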