On Wed, 18 Dec 2024 13:04:27 -0500 Steven Rostedt <rostedt@xxxxxxxxxxx> wrote: > > It's very much a part of the standard printf format, and is very much > > inherent to the whole varargs and C integer promotion rules (ie you > > literally *cannot* pass an actual 'char' value to a varargs function - > > the normal C integer type extension rules apply). > > > > So this is not really some odd kernel extension, and while there are > > only a handful of users in tracing (that apparently trace-cmd cannot > > deal with), it's not even _that_ uncommon in general: > > trace-cmd (and libtraceevent for that matter) does handle "%h" and "%hh" > as well. > > But the vbin_printf() which trace_printk() uses is a different beast, and > requires rebuilding the arguments so that it can be parsed, and there "%h" > isn't supported. Just to state the difference between TP_printk() and trace_printk() is that with trace events only the data is saved to the ring buffer. For example, for the sched_waking trace event: TRACE_EVENT(sched_waking, TP_PROTO(struct task_struct *p), TP_ARGS(__perf_task(p)), TP_STRUCT__entry( __array( char, comm, TASK_COMM_LEN ) __field( pid_t, pid ) __field( int, prio ) __field( int, target_cpu ) ), TP_fast_assign( memcpy(__entry->comm, p->comm, TASK_COMM_LEN); __entry->pid = p->pid; __entry->prio = p->prio; /* XXX SCHED_DEADLINE */ __entry->target_cpu = task_cpu(p); ), TP_printk("comm=%s pid=%d prio=%d target_cpu=%03d", __entry->comm, __entry->pid, __entry->prio, __entry->target_cpu) ); [ Note, I made this into a TRACE_EVENT() but in reality it's multiple events that uses DECLARE_EVENT_CLASS() and DEFINE_EVENT(), but the idea is still the same. ] TP_STRUCT__entry( __array( char, comm, TASK_COMM_LEN ) __field( pid_t, pid ) __field( int, prio ) __field( int, target_cpu ) ), That turns into: struct trace_event_raw_sched_waking { struct trace_entry ent; char comm[TASK_COMM_LEN]; pid_t pid; int prio; int target_cpu; char __data[]; } Then we have how to load that structure: TP_fast_assign( memcpy(__entry->comm, p->comm, TASK_COMM_LEN); __entry->pid = p->pid; __entry->prio = p->prio; /* XXX SCHED_DEADLINE */ __entry->target_cpu = task_cpu(p); ), Where the "__entry" is of type struct trace_event_raw_sched_waking and points into the reserved location in the ring buffer. This has the above assignments write directly into the ring buffer and avoids any copying. Now the "trace" file needs to know how to print it, that's where the TP_printk() is. It is basically a vsprintf(TP_printk()) with the __entry again pointing to the content of the ring buffer. But trace_printk() is not a trace event and requires writing the format as well as the data into the ring buffer when it is called. It use to simply just use vsnprintf() but it was considered much faster to not do the formatting during the record and to push it back to when it is read. As trace_printk() is used specifically to find hard to hit bugs, to keep it from causing "heisenbugs", using vbin_printf() proved to be much faster and made trace_printk() less intrusive to debugging. For historical analysis, here's where it was first introduced: https://lore.kernel.org/lkml/49aa0c73.1c07d00a.4fc6.ffffb4d7@xxxxxxxxxxxxx/ -- Steve