Hi Steve,
On 19/11/2024 19:10, Steven Rostedt wrote:
On Tue, 19 Nov 2024 19:06:45 +0100
Jean-Michel Hautbois <jeanmichel.hautbois@xxxxxxxxxx> wrote:
It shouldn't crash, but it also found a bug in your code ;-)
In my code is a really big assumption :-).
Well, not you personally, but I meant "your" as in m68k code.
You reference two variables that are not part of the event:
"mem_map" and "m68k_memory[0].addr"
Do these variables ever change? Because the TP_printk() part of the
TRACE_EVENT() macro is called a long time after the event is recorded. It
could be seconds, minutes, days or even months (and, though unlikely,
possibly years) later.
I am really not the best placed to answer.
AFAIK, it sounds like those are never changing.
That would mean they are OK and will not corrupt the trace, but it will be
meaningless for tools like perf and trace-cmd.
The event takes place and runs the TP_fast_assign() to record the event in
the ring buffer. Then some time later, when you read the "trace" file, the
TP_printk() portion gets run. If you wait months before reading that, it is
executed months later.
Now you have "mem_map" and "m68k_memory[0].addr" in that output that gets
run months after the fact. Are they constant throughout the boot?
I don't know.
Now another issue is that user space has no idea what those values are. Now
user space can not print the values. Currently the code crashes because you
are the first one to reference a global value from a trace event print fmt.
That should probably be fixed to simply fail to parse the event and ignore
the print format logic (which defaults to just printing the raw fields).
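To illustrate the record-time versus print-time split described above, here
is a minimal user-space sketch in C. It is not kernel code: fake_base,
struct fake_event and the three helpers are hypothetical stand-ins for a
global such as mem_map, a ring-buffer record, TP_fast_assign() and
TP_printk(). It only shows why a print format that reads a global can
disagree with one that uses only recorded fields.

```c
#include <stdio.h>

/* Hypothetical stand-in for a global such as mem_map (not the kernel symbol). */
static unsigned long fake_base = 0x1000;

/* Hypothetical ring-buffer record: only these fields are saved at event time. */
struct fake_event {
	unsigned long pfn;
	unsigned long base_at_record;	/* snapshot taken when the event fires */
};

/* Analogue of TP_fast_assign(): runs when the event happens. */
static void record_event(struct fake_event *ev, unsigned long pfn)
{
	ev->pfn = pfn;
	ev->base_at_record = fake_base;
}

/* Analogue of a TP_printk() that references a global: evaluated at read time. */
static unsigned long print_with_global(const struct fake_event *ev)
{
	return fake_base + ev->pfn;
}

/* Analogue of a TP_printk() that uses only the recorded fields. */
static unsigned long print_with_fields(const struct fake_event *ev)
{
	return ev->base_at_record + ev->pfn;
}
```

If the event is recorded with pfn 4 and fake_base is then changed to 0x2000
before the record is read, print_with_global() yields 0x2004 while
print_with_fields() still yields 0x1004. The second form is also the only
one a user-space parser could reproduce, since everything it needs is in
the record itself.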
The patch you sent works...
But, it fails a bit later:
Dispatching timerlat u procs
starting loop
User-space timerlat pid 230 on cpu 0
Segmentation fault
More printk? ;-)
Indeed, but the result is not straightforward this time :-(.
Long story short: it fails at the kbuffer_load_subbuffer() call in
read_cpu_pages().
I added printf calls to the kbuffer helpers in libtraceevent, and the last
output before the crash is:
__read_long_4: call read_4 at 0x600230c2
__read_4_sw: ptr=0x8044e2ac
static unsigned int __read_4_sw(void *ptr)
{
	printf("%s: ptr=%p, value: %08x\n", __func__, ptr, *(unsigned int *)ptr);
	unsigned int data = *(unsigned int *)ptr;
	printf("%s: data=%08x\n", __func__, data);
	return swap_4(data);
}
As soon as ptr is dereferenced, the segfault appears.
ptr should be ok though, as the address is valid afaik...
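One way to double-check that assumption would be to look the address up in
/proc/self/maps just before the dereference. The helper below is only a
sketch: addr_is_readable() is a hypothetical debug function, not part of
libtraceevent, and it assumes a Linux /proc filesystem is mounted.

```c
#include <stdio.h>

/*
 * Hypothetical debug helper: returns 1 if addr lies inside a readable
 * mapping of this process according to /proc/self/maps, 0 if it does not,
 * and -1 if the maps file cannot be opened.
 */
static int addr_is_readable(const void *addr)
{
	unsigned long start, end, a = (unsigned long)addr;
	char perms[5];
	char line[512];
	int found = 0;
	FILE *f = fopen("/proc/self/maps", "r");

	if (!f)
		return -1;

	while (fgets(line, sizeof(line), f)) {
		/* Each entry starts with "start-end perms", e.g. "8044e000-80450000 r--s". */
		if (sscanf(line, "%lx-%lx %4s", &start, &end, perms) != 3)
			continue;
		if (a >= start && a < end) {
			found = (perms[0] == 'r');
			break;
		}
	}
	fclose(f);
	return found;
}
```

Printing addr_is_readable(ptr) before the first dereference in
__read_4_sw() would show whether the address is really inside a readable
mapping at that point.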
I must say that now I am stuck :-(.
Thanks,
JM