Re: ptrace problem with 2.6.25 on Itanium

On Thu, 2008-04-24 at 12:39 +0200, stephane eranian wrote:
> Hello everyone,
> 
> I am running into a new problem with perfmon on Itanium and 2.6.25.
> 
> The pfmon tool is able to monitor across fork(). For that it relies on
> ptrace() to receive notifications on fork. This works fine on X86 with
> 2.6.25; however, it is currently broken on IA-64.
> 
> Normally, on fork(), the ptracing parent (here pfmon) receives 2 notifications:
> 
>    1. SIGTRAP with event PTRACE_EVENT_FORK to indicate that a new process
>       is being created. The new pid is extracted via PTRACE_GETEVENTMSG.
> 
>    2. SIGSTOP for the new pid, indicating that the child is ready to
>       execute its first instruction.
> 
> 
> The first message allows the tool to create the data structure for the
> new process; the second marks the point where a perfmon context can
> actually be attached.
> 
> With 2.6.25 on Itanium, the notifications are received out of order,
> i.e., the SIGSTOP first and the FORK notification next. Of course, the
> tool is confused because, until it sees the FORK event, it does not know
> about the new process.
> 
> This situation never happens on X86 with the same kernel.
> 
> To demonstrate the problem, I have attached a simple test program. You need
> to pass the name of a command that creates child processes. Look at the order
> between the FORK and SIGSTOP notifications. There is a forktest program in
> pfmon/tests.
> 
> I don't have time to track this down. However, I am highly suspicious of the
> new TIF_RESTORE_RSE and the arch_ptrace_stop_needed() code. The do_fork()
> routine does indeed set SIGSTOP before it calls ptrace_notify(). But this does
> not impact X86, which, by the way, does not define arch_ptrace_stop_needed().
> I don't have an older kernel handy to run the test. Hopefully someone
> on this list will try this on 2.6.24 or older.
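
For reference, the bookkeeping described in the quoted message can be modelled in a few lines. This is only a toy sketch in Python, not pfmon's actual code: the class ForkTracker and its method names are made up for illustration. It assumes the tracer has already decoded each wait status into either a FORK event (with the new pid) or a SIGSTOP from some pid, and it shows one possible way for a tool to tolerate either arrival order by remembering a SIGSTOP that shows up before the matching FORK event.

```python
class ForkTracker:
    """Toy model of a tracer's per-child bookkeeping (hypothetical names)."""

    def __init__(self):
        self.known = set()        # pids announced by a FORK event
        self.early_stops = set()  # pids whose SIGSTOP arrived before FORK
        self.attachable = []      # pids ready for attach, in arrival order

    def on_fork(self, new_pid):
        # FORK event: the tool can now create its data structure.
        self.known.add(new_pid)
        if new_pid in self.early_stops:
            # The child already stopped; it becomes attachable now.
            self.early_stops.discard(new_pid)
            self.attachable.append(new_pid)

    def on_sigstop(self, pid):
        if pid in self.known:
            # Expected order: FORK first, then SIGSTOP.
            self.attachable.append(pid)
        else:
            # Out-of-order case: remember the stop until FORK arrives.
            self.early_stops.add(pid)


expected = ForkTracker()
expected.on_fork(6199)
expected.on_sigstop(6199)

reordered = ForkTracker()
reordered.on_sigstop(6199)
reordered.on_fork(6199)

print(expected.attachable, reordered.attachable)
```

In both runs the child ends up in the attachable list exactly once; a tool without the early_stops set would instead see a SIGSTOP from a pid it does not know.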

I tried it on SLES10, which is basically a 2.6.16 with a simplified
version of the patch (one which only uses arch_ptrace_stop, but not
TIF_RESTORE_RSE) and it works as expected:

glass:~/ptrace-wrong-notify # ./task_ptrace_attach ./forktest 10 10
creating 10 additional process(es)
10 iterations
pid=6198 errno=0 exited=0 stopped=1 signaled=0 stopsig=5
FORK new_pid [6199]
pid=6199 errno=0 exited=0 stopped=1 signaled=0 stopsig=19
SIGSTOP from [6199]
pid=6198 errno=0 exited=0 stopped=1 signaled=0 stopsig=5
FORK new_pid [6200]
pid=6200 errno=0 exited=0 stopped=1 signaled=0 stopsig=19
SIGSTOP from [6200]
pid=6198 errno=0 exited=0 stopped=1 signaled=0 stopsig=5
FORK new_pid [6201]
pid=6199 errno=0 exited=1 stopped=0 signaled=0 stopsig=0
EXITED [6199]
pid=6200 errno=0 exited=1 stopped=0 signaled=0 stopsig=0
EXITED [6200]
pid=6201 errno=0 exited=0 stopped=1 signaled=0 stopsig=19
SIGSTOP from [6201]
pid=6198 errno=0 exited=0 stopped=1 signaled=0 stopsig=17
pid=6198 errno=0 exited=0 stopped=1 signaled=0 stopsig=17
pid=6201 errno=0 exited=1 stopped=0 signaled=0 stopsig=0
EXITED [6201]
pid=6198 errno=0 exited=0 stopped=1 signaled=0 stopsig=17
pid=6198 errno=0 exited=0 stopped=1 signaled=0 stopsig=5
FORK new_pid [6202]
pid=6202 errno=0 exited=0 stopped=1 signaled=0 stopsig=19
SIGSTOP from [6202]
pid=6198 errno=0 exited=0 stopped=1 signaled=0 stopsig=5
FORK new_pid [6203]
pid=6202 errno=0 exited=1 stopped=0 signaled=0 stopsig=0
EXITED [6202]
pid=6203 errno=0 exited=0 stopped=1 signaled=0 stopsig=19
SIGSTOP from [6203]
pid=6198 errno=0 exited=0 stopped=1 signaled=0 stopsig=17
pid=6203 errno=0 exited=1 stopped=0 signaled=0 stopsig=0
EXITED [6203]
pid=6198 errno=0 exited=0 stopped=1 signaled=0 stopsig=17
pid=6198 errno=0 exited=0 stopped=1 signaled=0 stopsig=5
FORK new_pid [6204]
pid=6204 errno=0 exited=0 stopped=1 signaled=0 stopsig=19
SIGSTOP from [6204]
pid=6198 errno=0 exited=0 stopped=1 signaled=0 stopsig=5
FORK new_pid [6205]
pid=6204 errno=0 exited=1 stopped=0 signaled=0 stopsig=0
EXITED [6204]
pid=6205 errno=0 exited=0 stopped=1 signaled=0 stopsig=19
SIGSTOP from [6205]
pid=6198 errno=0 exited=0 stopped=1 signaled=0 stopsig=17
pid=6198 errno=0 exited=0 stopped=1 signaled=0 stopsig=5
FORK new_pid [6206]
pid=6205 errno=0 exited=1 stopped=0 signaled=0 stopsig=0
EXITED [6205]
pid=6206 errno=0 exited=0 stopped=1 signaled=0 stopsig=19
SIGSTOP from [6206]
pid=6198 errno=0 exited=0 stopped=1 signaled=0 stopsig=17
pid=6198 errno=0 exited=0 stopped=1 signaled=0 stopsig=5
FORK new_pid [6207]
pid=6206 errno=0 exited=1 stopped=0 signaled=0 stopsig=0
EXITED [6206]
pid=6207 errno=0 exited=0 stopped=1 signaled=0 stopsig=19
SIGSTOP from [6207]
pid=6198 errno=0 exited=0 stopped=1 signaled=0 stopsig=17
pid=6198 errno=0 exited=0 stopped=1 signaled=0 stopsig=5
FORK new_pid [6208]
pid=6207 errno=0 exited=1 stopped=0 signaled=0 stopsig=0
EXITED [6207]
pid=6208 errno=0 exited=0 stopped=1 signaled=0 stopsig=19
SIGSTOP from [6208]
pid=6198 errno=0 exited=0 stopped=1 signaled=0 stopsig=17
pid=6208 errno=0 exited=1 stopped=0 signaled=0 stopsig=0
EXITED [6208]
pid=6198 errno=0 exited=0 stopped=1 signaled=0 stopsig=17
pid=6198 errno=0 exited=1 stopped=0 signaled=0 stopsig=0
EXITED [6198]
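
The FORK/SIGSTOP ordering in output of this shape can also be checked mechanically rather than by eye. A minimal Python sketch, assuming the log uses exactly the "FORK new_pid [N]" and "SIGSTOP from [N]" lines printed above; the function name out_of_order_pids is made up for illustration and is not part of the test program:

```python
import re

# Line formats taken from the test program's output shown above.
FORK_RE = re.compile(r"^FORK new_pid \[(\d+)\]$")
STOP_RE = re.compile(r"^SIGSTOP from \[(\d+)\]$")


def out_of_order_pids(log_text):
    """Return pids whose SIGSTOP line appears before their FORK line."""
    announced = set()  # pids already reported by a FORK line
    early = []         # pids that stopped before being announced
    for line in log_text.splitlines():
        line = line.strip()
        fork = FORK_RE.match(line)
        if fork:
            announced.add(int(fork.group(1)))
            continue
        stop = STOP_RE.match(line)
        if stop and int(stop.group(1)) not in announced:
            early.append(int(stop.group(1)))
    return early


sample = """\
pid=6198 errno=0 exited=0 stopped=1 signaled=0 stopsig=5
FORK new_pid [6199]
pid=6199 errno=0 exited=0 stopped=1 signaled=0 stopsig=19
SIGSTOP from [6199]
"""
print(out_of_order_pids(sample))
```

An empty list means every SIGSTOP followed its FORK notification, as in the run above; any pid in the list stopped before it was announced.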

So, if something is broken, it must be the TIF_RESTORE_RSE part of the
patch, or an unexpected side effect of switching to the generic
sys_ptrace. I plan to have a look at mainline later today...

Kind regards,
Petr Tesarik

