Re: [PATCH RFC 0/2] Add basic tracing support for m68k

Hi Steve !

On 18/11/2024 21:20, Steven Rostedt wrote:

[ Added Tomas as he knows this code better than I do ]

Thanks !


On Mon, 18 Nov 2024 11:11:48 +0100
Jean-Michel Hautbois <jeanmichel.hautbois@xxxxxxxxxx> wrote:

Hi Steve,

On 15/11/2024 20:55, Steven Rostedt wrote:
On Fri, 15 Nov 2024 16:33:06 +0100
Jean-Michel Hautbois <jeanmichel.hautbois@xxxxxxxxxx> wrote:
Hi Steve,

On 15/11/2024 16:25, Steven Rostedt wrote:
On Fri, 15 Nov 2024 09:26:07 +0100
Jean-Michel Hautbois <jeanmichel.hautbois@xxxxxxxxxx> wrote:
Nevertheless it sounds like a really high latency for wake_up().

I have a custom driver which basically gets an IRQ and then calls wake_up() to
unblock a pending read() call. Under high CPU usage, this wake_up() can take
more than 1 ms, even with a fifo/99 priority for my kernel thread!

Does this ring a bell?
I can obviously do more tests if that helps get to the bottom of the issue :-).

Try running timerlat.

Thanks !
Here is what I get:
# echo timerlat > current_tracer
# echo 1 > events/osnoise/enable
# echo 25 > osnoise/stop_tracing_total_us
# tail -10 trace
               bash-224     [000] d.h..   153.268917: #77645 context  irq timer_latency     45056 ns
               bash-224     [000] dnh..   153.268987: irq_noise: timer:206  start 153.268879083 duration 93957 ns
               bash-224     [000] d....   153.269056: thread_noise:  bash:224 start 153.268905324 duration 71045 ns
         timerlat/0-271     [000] .....   153.269103: #77645 context thread timer_latency    230656 ns
               bash-224     [000] d.h..   153.269735: irq_noise: timer:206 start 153.269613847 duration 103558 ns
               bash-224     [000] d.h..   153.269911: #77646 context irq timer_latency     40640 ns
               bash-224     [000] dnh..   153.269982: irq_noise: timer:206 start 153.269875367 duration 93190 ns
               bash-224     [000] d....   153.270053: thread_noise: bash:224 start 153.269900969 duration 72709 ns
         timerlat/0-271     [000] .....   153.270100: #77646 context thread timer_latency    227008 ns
         timerlat/0-271     [000] .....   153.270155: timerlat_main: stop tracing hit on cpu 0
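
To make the numbers above easier to compare against the 25 us stop threshold, here is a minimal sketch in C that pulls the context and the latency out of a "timer_latency" trace line. The names parse_timerlat_line() and over_threshold() are hypothetical helpers written for this illustration, not part of rtla or the kernel, and the line layout is assumed to match the output shown above.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: extract the context ("irq" or "thread") and the
 * latency in nanoseconds from one timerlat trace line.
 * Returns 1 if the line is a timer_latency line, 0 otherwise.
 */
static int parse_timerlat_line(const char *line, char *context,
			       size_t ctx_len, unsigned long long *latency_ns)
{
	const char *p = strstr(line, " context ");
	char ctx[16];
	unsigned long long ns;

	if (!p)
		return 0;	/* e.g. irq_noise or thread_noise lines */

	/* Whitespace in the format matches any run of blanks in the trace. */
	if (sscanf(p, " context %15s timer_latency %llu ns", ctx, &ns) != 2)
		return 0;

	snprintf(context, ctx_len, "%s", ctx);
	*latency_ns = ns;
	return 1;
}

/* Hypothetical helper: is a latency in ns above a threshold given in us? */
static int over_threshold(unsigned long long latency_ns,
			  unsigned long long threshold_us)
{
	return latency_ns > threshold_us * 1000ULL;
}
```

Fed the last "context thread" line of the trace, this yields 227008 ns, which is well above the 25 us written to osnoise/stop_tracing_total_us and is consistent with the "stop tracing hit" message that follows it.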

It looks awful, right ?

awful is relative ;-) If that was on x86, I would say it was bad.

Also check out rtla (in tools/tracing/rtla).

Thanks! I only knew it by name, so I watched a presentation given at OSS
Summit by Daniel Bristot de Oliveira, who wrote it, and it is really
impressive!

I had to modify the source code a bit, as it does not compile with my
uclibc toolchain:
diff --git a/tools/tracing/rtla/Makefile.rtla b/tools/tracing/rtla/Makefile.rtla
index cc1d6b615475..b22016a88d09 100644
--- a/tools/tracing/rtla/Makefile.rtla
+++ b/tools/tracing/rtla/Makefile.rtla
@@ -15,7 +15,7 @@ $(call allow-override,LD_SO_CONF_PATH,/etc/ld.so.conf.d/)
   $(call allow-override,LDCONFIG,ldconfig)
   export CC AR STRIP PKG_CONFIG LD_SO_CONF_PATH LDCONFIG

-FOPTS          := -flto=auto -ffat-lto-objects -fexceptions -fstack-protector-strong   \
+FOPTS          := -flto=auto -ffat-lto-objects -fexceptions \
                  -fasynchronous-unwind-tables -fstack-clash-protection
   WOPTS          := -O -Wall -Werror=format-security -Wp,-D_FORTIFY_SOURCE=2             \
                  -Wp,-D_GLIBCXX_ASSERTIONS -Wno-maybe-uninitialized

I'm not sure what the consequence of the above would be. Perhaps Daniel
just copied this from other code?

diff --git a/tools/tracing/rtla/src/timerlat_u.c b/tools/tracing/rtla/src/timerlat_u.c
index 01dbf9a6b5a5..92ad2388b123 100644
--- a/tools/tracing/rtla/src/timerlat_u.c
+++ b/tools/tracing/rtla/src/timerlat_u.c
@@ -15,10 +15,16 @@
   #include <pthread.h>
   #include <sys/wait.h>
   #include <sys/prctl.h>
+#include <sys/syscall.h>

   #include "utils.h"
   #include "timerlat_u.h"

+static inline pid_t gettid(void)
+{
+       return syscall(SYS_gettid);
+}
+
   /*
    * This is the user-space main for the tool timerlatu/ threads.
    *
diff --git a/tools/tracing/rtla/src/utils.c b/tools/tracing/rtla/src/utils.c
index 9ac71a66840c..b754dc1016a4 100644
--- a/tools/tracing/rtla/src/utils.c
+++ b/tools/tracing/rtla/src/utils.c
@@ -229,6 +229,9 @@ long parse_ns_duration(char *val)
   #elif __s390x__
   # define __NR_sched_setattr    345
   # define __NR_sched_getattr    346
+#elif __m68k__
+# define __NR_sched_setattr    349
+# define __NR_sched_getattr    350
   #endif

   #define SCHED_DEADLINE         6
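
One note on the gettid() hunk above: glibc has shipped its own gettid() wrapper since 2.30, so a static function with that exact name may clash with the libc declaration when the same source is built against a recent glibc. A minimal sketch of a variant that sidesteps this uses a private name; rtla_gettid() is a hypothetical name chosen for this illustration, not something that exists in rtla.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE	/* for the syscall() declaration on strict builds */
#endif
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical private wrapper: ask the kernel for the thread ID directly,
 * so the code does not depend on whether the C library provides gettid().
 */
static inline pid_t rtla_gettid(void)
{
	return (pid_t)syscall(SYS_gettid);
}
```

In the main thread of a process the value returned is the same as getpid(), which gives a quick sanity check that the wrapper works on a given libc.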

But it is not enough, as executing rtla fails with a segfault.
I can dump a core, but I could not manage to build gdb for my board so I
can't debug it (I don't know how to debug a coredump without gdb !).

printf()!  That's how I debug things without gdb ;-)
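
For what it is worth, a minimal sketch of the kind of printf helper that works for chasing a segfault is below. DBG() and dbg_at() are hypothetical names made up for this illustration. The one design choice that matters is writing to stderr: it is unbuffered by default, so the last message is not lost in a stdio buffer when the process crashes right after it.

```c
#include <stdarg.h>
#include <stdio.h>

/*
 * Hypothetical debug helper: print "file:line: message" to stderr.
 * Returns the number of characters written.
 */
static int dbg_at(const char *file, int line, const char *fmt, ...)
{
	va_list ap;
	int n;

	n = fprintf(stderr, "%s:%d: ", file, line);
	va_start(ap, fmt);
	n += vfprintf(stderr, fmt, ap);
	va_end(ap);
	n += fprintf(stderr, "\n");
	return n;
}

/* Record the call site automatically. */
#define DBG(...) dbg_at(__FILE__, __LINE__, __VA_ARGS__)
```

Sprinkling calls such as DBG("before parsing %s", path) around a suspect call narrows the crash down to the last line that was printed.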

Indeed printf gave me clues !
It appears to be a bug in libtracefs (v1.8.1). rtla segfaults when calling tracefs_local_events() in trace_instance_init().

Debugging libtracefs pointed me to the load_events() function, and the segfault happens after tep_parse_event() is called for "/sys/kernel/debug/tracing/events/vmscan/mm_vmscan_write_folio/format".

Going through the calls I get to event_read_print_args().
I changed libtraceevent log level to get the warnings, and it says:
libtraceevent: Resource temporarily unavailable
  unknown op '.'
Segmentation fault

JM








