Hi Steve,
On 19/11/2024 17:28, Steven Rostedt wrote:
On Tue, 19 Nov 2024 10:26:31 -0500
Steven Rostedt <rostedt@xxxxxxxxxxx> wrote:
Can you do me a favor and send me privately a tarball of:
# cp -r /sys/kernel/tracing/events /tmp/events
# cd /tmp
# tar -cvjf events.tar.bz2 events
You can't call tar directly on the /sys/kernel/tracing files, as those are
pseudo files with a reported size of zero, and tar will just record empty
files :-p
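For illustration, the copy-then-tar workaround above can also be done in one step by reading each file and sizing the archive member from the bytes actually returned, rather than from the reported size. This is only a minimal sketch: `archive_by_reading` is a hypothetical helper name, it assumes POSIX paths, and it silently skips files that cannot be read.

```python
import io
import os
import tarfile


def archive_by_reading(src_dir, tar_path):
    """Archive src_dir into a bzip2 tarball, sizing members from real reads.

    Pseudo files report a size of zero, so the size recorded in each
    tar header is taken from len(data) instead of from stat().
    Returns the number of files added.
    """
    src_dir = os.path.abspath(src_dir)
    base = os.path.dirname(src_dir)
    added = 0
    with tarfile.open(tar_path, "w:bz2") as tar:
        for root, _dirs, files in os.walk(src_dir):
            for name in sorted(files):
                path = os.path.join(root, name)
                try:
                    with open(path, "rb") as fh:
                        data = fh.read()  # read until EOF, ignore st_size
                except OSError:
                    continue  # assumption: unreadable files are skipped
                info = tarfile.TarInfo(name=os.path.relpath(path, base))
                info.size = len(data)
                tar.addfile(info, io.BytesIO(data))
                added += 1
    return added
```

Something like `archive_by_reading("/sys/kernel/tracing/events", "/tmp/events.tar.bz2")` would then stand in for the three commands above, although the `cp -r` route is simpler and needs no extra tooling.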
It crashes on parsing this:
name: mm_vmscan_write_folio
ID: 198
format:
field:unsigned short common_type; offset:0; size:2; signed:0;
field:unsigned char common_flags; offset:2; size:1; signed:0;
field:unsigned char common_preempt_count; offset:3; size:1; signed:0;
field:int common_pid; offset:4; size:4; signed:1;
field:unsigned long pfn; offset:8; size:4; signed:0;
field:int reclaim_flags; offset:12; size:4; signed:1;
print fmt: "page=%p pfn=0x%lx flags=%s", (mem_map + ((REC->pfn) - (m68k_memory[0].addr >> 13))), REC->pfn, (REC->reclaim_flags) ? __print_flags(REC->reclaim_flags, "|", {0x0001u, "RECLAIM_WB_ANON"}, {0x0002u, "RECLAIM_WB_FILE"}, {0x0010u, "RECLAIM_WB_MIXED"}, {0x0004u, "RECLAIM_WB_SYNC"}, {0x0008u, "RECLAIM_WB_ASYNC"} ) : "RECLAIM_WB_NONE"
It shouldn't crash, but it also found a bug in your code ;-)
Calling it my code is a really big assumption :-).
You reference two variables that are not part of the event:
"mem_map" and "m68k_memory[0].addr"
Do these variables ever change? Because the TP_printk() part of the
TRACE_EVENT() macro is called a long time after the event is recorded. It
could be seconds, minutes, days or even months (and, unlikely but possibly,
years) later.
I am really not the best person to answer.
AFAIK, those never change.
The event takes place and runs the TP_fast_assign() to record the event in
the ring buffer. Then some time later, when you read the "trace" file, the
TP_printk() portion gets run. If you wait months before reading that, it is
executed months later.
Now you have "mem_map" and "m68k_memory[0].addr" in that output that gets
run months after the fact. Are they constant throughout the boot?
I don't know.
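To make the timing concern concrete, here is a minimal sketch of the record-now, format-later split. It is a toy model, not kernel code: `kernel_globals`, `fast_assign`, `printk` and `read_trace` are hypothetical names standing in for the globals, TP_fast_assign(), TP_printk() and a read of the "trace" file, and pointer-arithmetic scaling by the size of the page structure is ignored.

```python
from collections import deque

# Hypothetical stand-ins for the two globals referenced by the print fmt.
kernel_globals = {
    "mem_map": 0x1000_0000,   # plays the role of mem_map
    "mem_base": 0x0000_0000,  # plays the role of m68k_memory[0].addr
}
PAGE_SHIFT = 13  # matches the ">> 13" in the print fmt

ring_buffer = deque(maxlen=1024)


def fast_assign(pfn, reclaim_flags):
    """Event time: store only the raw fields in the ring buffer."""
    ring_buffer.append({"pfn": pfn, "reclaim_flags": reclaim_flags})


def printk(rec):
    """Read time: format a record using whatever the globals hold *now*."""
    base_pfn = kernel_globals["mem_base"] >> PAGE_SHIFT
    page = kernel_globals["mem_map"] + (rec["pfn"] - base_pfn)
    return f"page={page:#x} pfn={rec['pfn']:#x}"


def read_trace():
    """Analogue of reading the "trace" file: format every stored record."""
    return [printk(rec) for rec in ring_buffer]
```

If `kernel_globals["mem_map"]` changes between `fast_assign()` and `read_trace()`, the same recorded event prints a different `page=` value. Recording the computed value as a field at event time would avoid that dependency.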
Another issue is that user space has no idea what those values are, so
user space cannot print them. Currently the code crashes because you
are the first one to reference a global value from a trace event print fmt.
That should probably be fixed to simply fail to parse the event and ignore
the print format logic (which defaults to just printing the raw fields).
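The fallback described above could look roughly like the sketch below. This is not the libtraceevent implementation: all function names are hypothetical, the identifier scan is a crude regex heuristic (it would also flag C casts, for example), `KNOWN_HELPERS` is an illustrative subset, and only integer fields are decoded, with big-endian byte order assumed for m68k.

```python
import re

# Helpers the parser is assumed to understand (illustrative subset).
KNOWN_HELPERS = {"__print_flags", "__print_symbolic", "__get_str"}

FIELD_RE = re.compile(
    r"field:(?P<decl>[^;]+);\s*offset:(?P<offset>\d+);\s*"
    r"size:(?P<size>\d+);\s*signed:(?P<signed>[01]);"
)
STRING_RE = re.compile(r'"(?:\\.|[^"\\])*"')


def parse_fields(format_text):
    """Return (name, offset, size, signed) for each 'field:' line."""
    fields = []
    for m in FIELD_RE.finditer(format_text):
        name = m.group("decl").split()[-1]
        fields.append((name, int(m.group("offset")),
                       int(m.group("size")), m.group("signed") == "1"))
    return fields


def unresolved_symbols(print_fmt):
    """List identifiers that are neither REC-> fields nor known helpers."""
    text = STRING_RE.sub("", print_fmt)          # drop string literals
    text = re.sub(r"REC\s*->\s*\w+", "", text)   # drop event field accesses
    # Skip struct members (".addr") and number suffixes ("0x0001u").
    idents = re.findall(r"(?<![\w.])[A-Za-z_]\w*", text)
    return sorted(set(idents) - KNOWN_HELPERS)


def raw_fields(fields, record, byteorder="big"):
    """Default output: print each non-common field as a plain integer."""
    parts = []
    for name, offset, size, signed in fields:
        if name.startswith("common_"):
            continue
        value = int.from_bytes(record[offset:offset + size],
                               byteorder, signed=signed)
        parts.append(f"{name}={value}")
    return " ".join(parts)


def render(format_text, record):
    """Return raw field output if the print fmt cannot be resolved.

    Returns None when nothing is unresolved, meaning the caller could
    evaluate the print fmt normally (out of scope for this sketch).
    """
    head, _, print_fmt = format_text.partition("print fmt:")
    if unresolved_symbols(print_fmt):
        return raw_fields(parse_fields(head), record)
    return None
```

For the mm_vmscan_write_folio format quoted earlier, `unresolved_symbols()` would report `mem_map` and `m68k_memory`, so `render()` would print only the raw `pfn` and `reclaim_flags` values instead of crashing.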
The patch you sent works...
But it fails a bit later:
Dispatching timerlat u procs
starting loop
User-space timerlat pid 230 on cpu 0
Segmentation fault
-- Steve