Hi Steve,

On 3/13/19 3:50 PM, Steven Rostedt wrote:
> On Wed, 13 Mar 2019 15:03:35 +0100
> Claudio <claudio.fontana@xxxxxxxxx> wrote:
>
>
>>>>
>>>> When I read an event for a new process or thread
>>>> (sched_process_fork), I need to know if it is a new thread or a
>>>> new process at that time, and collect the tgid of the child.
>>>
>>> You are reading this at the time of tracing?
>>
>> Yes, my requirements are for live tracing of the low level events,
>> and correlation of the events from all cores, in order to be able to
>> immediately respond to the events from the tracing application.
>>
>> Tracing is expected to be always on in the system (no expected
>> trace-first/analyze later pattern - everything related to the
>> analysis of the events is automated and occurs at runtime).
>
> You could add a kprobe to find out the tgid as well.

I think I am going to try this next.

>>
>> This works fine for the real time timing analysis at the thread
>> level, but for the process level I need to know the tgid of the
>> thread at the time I see the event.
>>>>
>>>> How can I get that information, if the update is done only at the
>>>> time of the first "child" context-switch?
>>>>
>>>> 3) Is the saved_tgids map in sync with the reader of
>>>> trace_pipe_raw, so that when reading an event from the pipe, the
>>>> saved_tgids represent the particular state of the tid to tgid maps
>>>> at the timestamp indicated in the event?
>>>
>>> Currently, it's only taken at the end of the trace.
>>
>> I wonder what the concept of the "end of the trace" is..
>
>
> I meant for "trace-cmd record", the saved_cmdlines are stored when the
> tracing is finished, not during the trace. But the saved_cmdlines is
> updated at each sched_switch, and so is the tgid.
>
> Now, we could add that info to the fork part as well.
>
> -- Steve
>

I have experimented with mainline, and then with mainline plus a patch
to add the info for fork as well, like this:

diff --git a/kernel/trace/trace_sched_switch.c b/kernel/trace/trace_sched_switch.c
index e288168..c5339ed 100644
--- a/kernel/trace/trace_sched_switch.c
+++ b/kernel/trace/trace_sched_switch.c
@@ -47,6 +47,20 @@ probe_sched_wakeup(void *ignore, struct task_struct *wakee)
 	tracing_record_taskinfo(current, flags);
 }
 
+static void
+probe_sched_fork(void *ignore,
+		 struct task_struct *parent, struct task_struct *child)
+{
+	int flags;
+
+	flags = (RECORD_TGID * !!sched_tgid_ref) +
+		(RECORD_CMDLINE * !!sched_cmdline_ref);
+
+	if (!flags)
+		return;
+	tracing_record_taskinfo(child, flags);
+}
+
 static int tracing_sched_register(void)
 {
 	int ret;
@@ -72,7 +86,16 @@ static int tracing_sched_register(void)
 		goto fail_deprobe_wake_new;
 	}
 
+	ret = register_trace_sched_process_fork(probe_sched_fork, NULL);
+	if (ret) {
+		pr_info("sched trace: Couldn't activate tracepoint"
+			" probe to kernel_sched_process_fork\n");
+		goto fail_deprobe_switch;
+	}
+
 	return ret;
+fail_deprobe_switch:
+	unregister_trace_sched_switch(probe_sched_switch, NULL);
 fail_deprobe_wake_new:
 	unregister_trace_sched_wakeup_new(probe_sched_wakeup, NULL);
 fail_deprobe:
@@ -82,6 +105,7 @@ static int tracing_sched_register(void)
 
 static void tracing_sched_unregister(void)
 {
+	unregister_trace_sched_process_fork(probe_sched_fork, NULL);
 	unregister_trace_sched_switch(probe_sched_switch, NULL);
 	unregister_trace_sched_wakeup_new(probe_sched_wakeup, NULL);
 	unregister_trace_sched_wakeup(probe_sched_wakeup, NULL);
--

It seems to work, but the model seems not to apply very well to my
use case.

saved_tgids cannot be reset, not even by disabling tracing completely

echo 0 > tracing_on

or

echo 0 > options/record-tgid

and the list just keeps growing: no entries are ever removed when tasks
are destroyed etc, so my lookups become more and more expensive as the
list grows.
I don't think this is suited to be looked up frequently for my tracing
scenario.

Next I will experiment with:

1) kprobes, trying to generate an event before every sched class event
   carrying the tid to tgid mapping.

2) if everything fails, revert to use /proc/PID/stat, which however can
   be slow and racy if the process does not exist anymore when I read
   back the ftrace buffers.

Thanks & ciao,

Claudio
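
P.S. For reference, a minimal userspace sketch of the /proc fallback
from point 2), under a few assumptions. It reads /proc/<tid>/status
rather than stat, because status carries an explicit "Tgid:" field. The
function names parse_tgid_line() and lookup_tgid() are hypothetical,
not part of any existing tool. The -1 return covers the racy case where
the task has already exited by the time the event is read back.

```c
#include <stdio.h>
#include <sys/types.h>

/*
 * Parse one line of /proc/<tid>/status.
 * Returns 1 and stores the value in *tgid if this is the "Tgid:" line,
 * otherwise returns 0 and leaves *tgid untouched.
 */
static int parse_tgid_line(const char *line, pid_t *tgid)
{
	int val;

	/* The space in the format also matches the tab after "Tgid:". */
	if (sscanf(line, "Tgid: %d", &val) != 1)
		return 0;
	*tgid = (pid_t)val;
	return 1;
}

/*
 * Look up the tgid of a thread through procfs.
 * Returns -1 if the task no longer exists (or no Tgid line was found),
 * which is the race mentioned above: the event is still in the buffer
 * but the task it describes is gone.
 */
static pid_t lookup_tgid(pid_t tid)
{
	char path[64], line[256];
	pid_t tgid = -1;
	FILE *f;

	snprintf(path, sizeof(path), "/proc/%d/status", (int)tid);
	f = fopen(path, "r");
	if (!f)
		return -1;

	while (fgets(line, sizeof(line), f)) {
		if (parse_tgid_line(line, &tgid))
			break;
	}
	fclose(f);
	return tgid;
}
```

A caller would have to treat -1 as "unknown" and either drop the
process-level attribution for that event or fall back to a mapping
captured earlier, which is why I consider this the last resort.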