Hello,

Here is the 2nd version of the RFC series to add boot-time tracing
using devicetree.

The previous thread is here:

https://lkml.kernel.org/r/156113387975.28344.16009584175308192243.stgit@devnote2

In this version, I moved the ftrace node under /chosen/linux,ftrace
and removed the compatible property, because the node must be in a
fixed place.

This version also has the following features:

- Introduce "instance" node, which can have events nodes for setting
  events filters and actions for the trace instance.
- Introduce "cpumask" property
- Introduce "ftrace-filters" and "ftrace-notraces"
- Introduce "fgraph-filters", "fgraph-notraces" and "fgraph-max-depth"

At this moment, this feature is only available on architectures which
support devicetree. For x86, we can use it on qemu with the --dtb
option, or apply the patch below to grub to add devicetree support on
grub-x86.

https://github.com/mhiramat/grub/commit/644c35bfd2d18c772cc353b74215344f8264923a

Note that the devicetree for x86 must contain nodes only under
/chosen/, or it may cause a problem if it conflicts with ACPI.
(Maybe we need to disable the FDT nodes except for nodes under
/chosen on boot if ACPI exists.)

This series can be applied on Steve's tracing tree (ftrace/core), or
is available at

https://github.com/mhiramat/linux.git ftrace-devicetree-v2

Usage
=====

With this series, we can set up new kprobe and synthetic events, more
complicated event filters, and trigger actions including histograms
via devicetree.

For example, the following kernel parameters

    trace_options=sym-addr trace_event=initcall:* tp_printk trace_buf_size=1M ftrace=function ftrace_filter="vfs*"

can be written in devicetree like below.

/{
    chosen {
        ...
        ftrace {
            options = "sym-addr";
            events = "initcall:*";
            tp-printk;
            buffer-size-kb = <0x400>;   // 1024KB == 1MB
            ftrace-filters = "vfs*";
        };

Moreover, we can now expand it to add filters for events, kprobe
events, and synthetic events with histograms like below.

    ftrace {
        ...
        event0 {
            event = "task:task_newtask";
            filter = "pid < 128";       // adding filters
            enable;
        };
        event1 {
            event = "kprobes:vfs_read";
            probes = "vfs_read $arg1 $arg2";    // add kprobes
            filter = "common_pid < 200";
            enable;
        };
        event2 {
            event = "initcall_latency"; // add synth event
            fields = "unsigned long func", "u64 lat";
            // with histogram
            actions = "hist:keys=func.sym,lat:vals=lat:sort=lat";
        };
        // and synthetic event callers
        event3 {
            event = "initcall:initcall_start";
            actions = "hist:keys=func:ts0=common_timestamp.usecs";
        };
        event4 {
            event = "initcall:initcall_finish";
            actions = "hist:keys=func:lat=common_timestamp.usecs-$ts0:onmatch(initcall.initcall_start).initcall_latency(func,$lat)";
        };
    };

Also, this version supports the "instance" node, which allows us to
run several tracers for different purposes at once. For example, if
one tracer traces functions in module alpha and another traces
functions in module beta, you can write the following.

    ftrace {
        instance0 {
            instance = "foo";
            tracer = "function";
            ftrace-filters = "*:mod:alpha";
        };
        instance1 {
            instance = "bar";
            tracer = "function";
            ftrace-filters = "*:mod:beta";
        };
    };

The instance node also accepts event nodes so that each instance can
customize its event tracing.

Discussion
==========

On the previous thread, we discussed whether this devicetree usage
itself is acceptable or not. Fortunately, I had a chance to discuss
it in a F2F meeting with Frank and Tim last week.

I think the advantages of using devicetree are:

- reuse devicetree's structured syntax for complicated tracefs
  settings
- reuse the OF-APIs in the linux kernel to accept and parse it
- reuse the dtc compiler to compile it and validate its syntax
  (with a yaml schema, we can enhance it)
- reuse current bootloaders (and qemu) to load it

And we talked about some other ideas to avoid using devicetree.

- expand kernel command line (ascii command strings)
- expand kernel command line with base64 encoded compressed ascii
  command strings
- load (compressed) ascii command strings to somewhere on memory and
  pass the address via kernel cmdline
- load (compressed) ascii command strings to somewhere on memory and
  pass the address via /chosen node (as same as initrd)
- load binary C data and point it from kernel cmdline
- load binary C data and point it from /chosen node (as same as
  initrd)
- load binary C data as a section of kernel image

The first 2 ideas expand the kernel's cmdline to pass some "magic"
command to set up ftrace. In both cases, the problems are the maximal
size of the cmdline and the complexity of the commands. My example
showed that the ftrace settings become long even for a single
histogram, and can be longer than 256 bytes. Long and complex data
can easily lead to mistyping, but the cmdline has no syntax
validator; it just ignores the mistyped commands.
(Of course even with the devicetree, it must be smaller than 2 pages)

The next 2 ideas are similar, but load the commands on some other
memory area and pass only the address via cmdline. This solves the
size limitation issue, but there is still no syntax validation. Of
course we can make a new structured syntax validator similar to (or
just forked from) dt-validate.
The problem (or disadvantage) of these (and the following) ideas is
that we need to change the kernel and boot loaders to load other
binary blobs on memory.

Maybe we could introduce generic structured kernel boot arguments,
which would be a kind of /chosen node of devicetree. (But if there is
already such a hook, why would we make another one...?)
Also, this "GSKBA" may introduce a parser and access APIs which will
be very similar to the OF-APIs. This also seems redundant to me.

So the last 3 ideas avoid introducing a new parser and APIs; we just
compile the data as C data and point it from cmdline or somewhere
else.
With these ideas, we still need to expand boot loaders to support
loading new binary blobs. (And the last one requires adding an ELF
header parser/modifier to the boot loader too)

For the above reasons, I think using devicetree's /chosen node is the
least intrusive way to introduce this boot-time tracing feature.

Any suggestions, thoughts?

Thank you,

---

Masami Hiramatsu (15):
      tracing: Apply soft-disabled and filter to tracepoints printk
      tracing: kprobes: Output kprobe event to printk buffer
      tracing: Expose EXPORT_SYMBOL_GPL symbol
      tracing: kprobes: Register to dynevent earlier stage
      tracing: Accept different type for synthetic event fields
      tracing: Add NULL trace-array check in print_synth_event()
      dt-bindings: tracing: Add ftrace binding document
      tracing: of: Add setup tracing by devicetree support
      tracing: of: Add trace event settings
      tracing: of: Add kprobe event support
      tracing: of: Add synthetic event support
      tracing: of: Add instance node support
      tracing: of: Add cpumask property support
      tracing: of: Add function tracer filter properties
      tracing: of: Add function-graph tracer option properties


 .../devicetree/bindings/chosen/linux,ftrace.yaml   |  306 ++++++++++++
 include/linux/trace_events.h                       |    1
 kernel/trace/Kconfig                               |   10
 kernel/trace/Makefile                              |    1
 kernel/trace/ftrace.c                              |   85 ++-
 kernel/trace/trace.c                               |   90 ++--
 kernel/trace/trace_events.c                        |    3
 kernel/trace/trace_events_hist.c                   |   14 -
 kernel/trace/trace_events_trigger.c                |    2
 kernel/trace/trace_kprobe.c                        |   81 ++-
 kernel/trace/trace_of.c                            |  507 ++++++++++++++++++++
 11 files changed, 1004 insertions(+), 96 deletions(-)
 create mode 100644 Documentation/devicetree/bindings/chosen/linux,ftrace.yaml
 create mode 100644 kernel/trace/trace_of.c

--
Masami Hiramatsu (Linaro) <mhiramat@xxxxxxxxxx>