First, apply the correct heading level based on the section numbering.
Then remove leading section numbers altogether, since they quickly get
out of order when maintained manually (see the duplicate "5.3" in
events.rst, and the out-of-order numbering in histogram.rst, where some
sections are numbered and others are not). Moreover, section numbers are
autogenerated in the PDF builds anyway, leading to strange doubly
numbered sections like "14.2.3 6.2 ‘hist’ trigger examples".

In kprobetrace.rst, the now-dangling reference is in a literal block
where :ref: roles are not parsed (the whole block could benefit from
being converted to reST list syntax, but that is beyond the scope of
this patch), so replace the now-invisible section number with the
section title instead.

Signed-off-by: Roland Hieber <rhi@xxxxxxxxxxxxxx>
---
PATCH v1: https://lore.kernel.org/all/20200609141027.21508-1-rhi@xxxxxxxxxxxxxx

PATCH v1 -> v2:
 - rebase to current master
 - split up the patch into a three-patch series
 - introduce refs for cross-referencing (feedback by Jonathan Corbet)
   (see previous patch in this series)
 - dangling reference in kprobetrace.rst, see above
---
 Documentation/trace/events-kmem.rst | 20 ++---
 Documentation/trace/events-power.rst | 18 ++--
 Documentation/trace/events.rst | 96 ++++++++++-----------
 Documentation/trace/ftrace.rst | 2 +-
 Documentation/trace/histogram.rst | 32 +++----
 Documentation/trace/kprobetrace.rst | 3 +-
 Documentation/trace/tracepoint-analysis.rst | 56 ++++++------
 7 files changed, 114 insertions(+), 113 deletions(-)

diff --git a/Documentation/trace/events-kmem.rst b/Documentation/trace/events-kmem.rst
index 68fa75247488..0e8904be15a3 100644
--- a/Documentation/trace/events-kmem.rst
+++ b/Documentation/trace/events-kmem.rst
@@ -14,8 +14,8 @@ within the kernel. Broadly speaking there are five major subheadings. This document describes what each of the tracepoints is and why they might be useful. -1. 
Slab allocation of small objects of unknown type -=================================================== +Slab allocation of small objects of unknown type +================================================ :: kmalloc call_site=%lx ptr=%p bytes_req=%zu bytes_alloc=%zu gfp_flags=%s @@ -29,8 +29,8 @@ kmalloc with kfree, it may be possible to identify memory leaks and where the allocation sites were. -2. Slab allocation of small objects of known type -================================================= +Slab allocation of small objects of known type +============================================== :: kmem_cache_alloc call_site=%lx ptr=%p bytes_req=%zu bytes_alloc=%zu gfp_flags=%s @@ -42,8 +42,8 @@ it is likely easier to pin the event down to a specific cache. At the time of writing, no information is available on what slab is being allocated from, but the call_site can usually be used to extrapolate that information. -3. Page allocation -================== +Page allocation +=============== :: mm_page_alloc page=%p pfn=%lu order=%d migratetype=%d gfp_flags=%s @@ -71,8 +71,8 @@ freed in batch with a page list. Significant amounts of activity here could indicate that the system is under memory pressure and can also indicate contention on the lruvec->lru_lock. -4. Per-CPU Allocator Activity -============================= +Per-CPU Allocator Activity +========================== :: mm_page_alloc_zone_locked page=%p pfn=%lu order=%u migratetype=%d cpu=%d percpu_refill=%d @@ -100,8 +100,8 @@ and drains on another could be a factor in causing large amounts of cache line bounces due to writes between CPUs and worth investigating if pages can be allocated and freed on the same CPU through some algorithm change. -5. 
External Fragmentation -========================= +External Fragmentation +====================== :: mm_page_alloc_extfrag page=%p pfn=%lu alloc_order=%d fallback_order=%d pageblock_order=%d alloc_migratetype=%d fallback_migratetype=%d fragmenting=%d change_ownership=%d diff --git a/Documentation/trace/events-power.rst b/Documentation/trace/events-power.rst index f45bf11fa88d..1f60cfd03c97 100644 --- a/Documentation/trace/events-power.rst +++ b/Documentation/trace/events-power.rst @@ -15,11 +15,11 @@ might be useful. Cf. include/trace/events/power.h for the events definitions. -1. Power state switch events +Power state switch events ============================ -1.1 Trace API ------------------ +Trace API +--------- A 'cpu' event class gathers the CPU-related events: cpuidle and cpufreq. @@ -45,8 +45,8 @@ The event which has 'state=4294967295' in the trace is very important to the use space tools which are using it to detect the end of the current state, and so to correctly draw the states diagrams and to calculate accurate statistics etc. -2. Clocks events -================ +Clocks events +============= The clock events are used for clock enable/disable and for clock rate change. :: @@ -59,8 +59,8 @@ The first parameter gives the clock name (e.g. "gpio1_iclk"). The second parameter is '1' for enable, '0' for disable, the target clock rate for set_rate. -3. Power domains events -======================= +Power domains events +==================== The power domain events are used for power domains transitions :: @@ -69,8 +69,8 @@ The power domain events are used for power domains transitions The first parameter gives the power domain name (e.g. "mpu_pwrdm"). The second parameter is the power domain target state. -4. PM QoS events -================ +PM QoS events +============= The PM QoS events are used for QoS add/update/remove request and for target/flags update. 
:: diff --git a/Documentation/trace/events.rst b/Documentation/trace/events.rst index e83d0d5378be..0d4f8f63236f 100644 --- a/Documentation/trace/events.rst +++ b/Documentation/trace/events.rst @@ -5,8 +5,8 @@ Event Tracing :Author: Theodore Ts'o :Updated: Li Zefan and Tom Zanussi -1. Introduction -=============== +Introduction +============ Tracepoints (see Documentation/trace/tracepoints.rst) can be used without creating custom kernel modules to register probe functions @@ -17,13 +17,13 @@ the kernel developer must provide code snippets which define how the tracing information is saved into the tracing buffer, and how the tracing information should be printed. -2. Using Event Tracing -====================== +Using Event Tracing +=================== .. _tracing_set_event_interface: -2.1 Via the 'set_event' interface ---------------------------------- +Via the 'set_event' interface +----------------------------- The events which are available for tracing can be found in the file /sys/kernel/debug/tracing/available_events. @@ -57,8 +57,8 @@ command:: # echo 'irq:*' > /sys/kernel/debug/tracing/set_event -2.2 Via the 'enable' toggle ---------------------------- +Via the 'enable' toggle +----------------------- The events available are also listed in /sys/kernel/debug/tracing/events/ hierarchy of directories. @@ -86,8 +86,8 @@ When reading one of these enable files, there are four results: - X - there is a mixture of events enabled and disabled - ? - this file does not affect any event -2.3 Boot option ---------------- +Boot option +----------- In order to facilitate early boot debugging, use boot option:: @@ -96,15 +96,15 @@ In order to facilitate early boot debugging, use boot option:: event-list is a comma separated list of events. See :ref:`tracing_set_event_interface` for event format. -3. 
Defining an event-enabled tracepoint -======================================= +Defining an event-enabled tracepoint +==================================== See The example provided in samples/trace_events .. _tracing_event_formats: -4. Event formats -================ +Event formats +============= Each trace event has a 'format' file associated with it that contains a description of each field in a logged event. This information can @@ -156,8 +156,8 @@ event-specific. All the fields for this event are numeric, except for .. _tracing_event_filters: -5. Event filtering -================== +Event filtering +=============== Trace events can be filtered in the kernel by associating boolean 'filter expressions' with them. As soon as an event is logged into @@ -168,8 +168,8 @@ values don't match will be discarded. An event with no filter associated with it matches everything, and is the default when no filter has been set for an event. -5.1 Expression syntax ---------------------- +Expression syntax +----------------- A filter expression consists of one or more 'predicates' that can be combined using the logical operators '&&' and '||'. A predicate is @@ -213,8 +213,8 @@ field name:: As the kernel will have to know how to retrieve the memory that the pointer is at from user space. -5.2 Setting filters -------------------- +Setting filters +--------------- A filter for an individual event is set by writing a filter expression to the 'filter' file for the given event. @@ -245,8 +245,8 @@ Currently the caret ('^') for an error always appears at the beginning of the filter string; the error message should still be useful though even without more accurate position info. -5.2.1 Filter limitations ------------------------- +Filter limitations +------------------ If a filter is placed on a string pointer ``(char *)`` that does not point to a string on the ring buffer, but instead points to kernel or user space @@ -255,8 +255,8 @@ copied onto a temporary buffer to do the compare. 
If the copy of the memory faults (the pointer points to memory that should not be accessed), then the string compare will be treated as not matching. -5.3 Clearing filters --------------------- +Clearing filters +---------------- To clear the filter for an event, write a '0' to the event's filter file. @@ -264,8 +264,8 @@ file. To clear the filters for all events in a subsystem, write a '0' to the subsystem's filter file. -5.3 Subsystem filters ---------------------- +Subsystem filters +----------------- For convenience, filters for every event in a subsystem can be set or cleared as a group by writing a filter expression into the filter file @@ -311,8 +311,8 @@ their old filters):: # cat sched_wakeup/filter common_pid == 0 -5.4 PID filtering ------------------ +PID filtering +------------- The set_event_pid file in the same directory as the top events directory exists, will filter all events from tracing any task that does not have the @@ -332,8 +332,8 @@ To add more PIDs without losing the PIDs already included, use '>>'. .. _tracing_event_triggers: -6. Event triggers -================= +Event triggers +============== Trace events can be made to conditionally invoke trigger 'commands' which can take various forms and are described in detail below; @@ -373,8 +373,8 @@ way, so beware about making generalizations between the two. can also enable triggers that are written into /sys/kernel/tracing/events/ftrace/print/trigger -6.1 Expression syntax ---------------------- +Expression syntax +----------------- Triggers are added by echoing the command to the 'trigger' file:: @@ -397,8 +397,8 @@ adds or removes a single trigger and there's no explicit '>>' support ('>' actually behaves like '>>') or truncation support to remove all triggers (you have to use '!' for each one added.) 
-6.2 Supported trigger commands ------------------------------- +Supported trigger commands +-------------------------- The following commands are supported: @@ -553,8 +553,8 @@ The following commands are supported: See Documentation/trace/histogram.rst for details and examples. -7. In-kernel trace event API -============================ +In-kernel trace event API +========================= In most cases, the command-line interface to trace events is more than sufficient. Sometimes, however, applications might find the need for @@ -586,8 +586,8 @@ following: - tracing synthetic events from in-kernel code - the low-level "dynevent_cmd" API -7.1 Dyamically creating synthetic event definitions ---------------------------------------------------- +Dyamically creating synthetic event definitions +----------------------------------------------- There are a couple ways to create a new synthetic event from a kernel module or other kernel code. @@ -703,8 +703,8 @@ registered by calling the synth_event_gen_cmd_end() function:: At this point, the event object is ready to be used for tracing new events. -7.2 Tracing synthetic events from in-kernel code ------------------------------------------------- +Tracing synthetic events from in-kernel code +-------------------------------------------- To trace a synthetic event, there are several options. The first option is to trace the event in one call, using synth_event_trace() @@ -715,8 +715,8 @@ synth_event_trace_start() and synth_event_trace_end() along with synth_event_add_next_val() or synth_event_add_val() to add the values piecewise. -7.2.1 Tracing a synthetic event all at once -------------------------------------------- +Tracing a synthetic event all at once +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ To trace a synthetic event all at once, the synth_event_trace() or synth_event_trace_array() functions can be used. 
@@ -817,8 +817,8 @@ remove the event:: ret = synth_event_delete("schedtest"); -7.2.2 Tracing a synthetic event piecewise ------------------------------------------ +Tracing a synthetic event piecewise +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ To trace a synthetic using the piecewise method described above, the synth_event_trace_start() function is used to 'open' the synthetic @@ -901,8 +901,8 @@ Note that synth_event_trace_end() must be called at the end regardless of whether any of the add calls failed (say due to a bad field name being passed in). -7.3 Dyamically creating kprobe and kretprobe event definitions --------------------------------------------------------------- +Dyamically creating kprobe and kretprobe event definitions +---------------------------------------------------------- To create a kprobe or kretprobe trace event from kernel code, the kprobe_event_gen_cmd_start() or kretprobe_event_gen_cmd_start() @@ -978,8 +978,8 @@ used to give the kprobe event file back and delete the event:: ret = kprobe_event_delete("gen_kprobe_test"); -7.4 The "dynevent_cmd" low-level API ------------------------------------- +The "dynevent_cmd" low-level API +-------------------------------- Both the in-kernel synthetic event and kprobe interfaces are built on top of a lower-level "dynevent_cmd" interface. This interface is diff --git a/Documentation/trace/ftrace.rst b/Documentation/trace/ftrace.rst index d018bd332200..7f9ec00d5748 100644 --- a/Documentation/trace/ftrace.rst +++ b/Documentation/trace/ftrace.rst @@ -2453,7 +2453,7 @@ Or this simple script! function graph tracer ---------------------------- +--------------------- This tracer is similar to the function tracer except that it probes a function on its entry and its exit. 
This is done by diff --git a/Documentation/trace/histogram.rst b/Documentation/trace/histogram.rst index 859fd1b76c63..be1dcc7f84c4 100644 --- a/Documentation/trace/histogram.rst +++ b/Documentation/trace/histogram.rst @@ -4,16 +4,16 @@ Event Histograms Documentation written by Tom Zanussi -1. Introduction -=============== +Introduction +============ Histogram triggers are special event triggers that can be used to aggregate trace event data into histograms. For information on trace events and event triggers, see Documentation/trace/events.rst. -2. Histogram Trigger Command -============================ +Histogram Trigger Command +========================= A histogram trigger command is an event trigger command that aggregates event hits into a hash table keyed on one or more trace @@ -203,8 +203,8 @@ Extended error information tracing/error_log file. See Error Conditions in :file:`Documentation/trace/ftrace.rst` for details. -6.2 'hist' trigger examples ---------------------------- +'hist' trigger examples +----------------------- The first set of examples creates aggregations using the kmalloc event. The fields that can be used for the hist trigger are listed @@ -1599,8 +1599,8 @@ Extended error information Entries: 7 Dropped: 0 -2.2 Inter-event hist triggers ------------------------------ +Inter-event hist triggers +------------------------- Inter-event hist triggers are hist triggers that combine values from one or more other events and create a histogram using that data. Data @@ -1676,8 +1676,8 @@ pseudo-file. These features are described in more detail in the following sections. -2.2.1 Histogram Variables -------------------------- +Histogram Variables +------------------- Variables are simply named locations used for saving and retrieving values between matching events. A 'matching' event is defined as an @@ -1778,8 +1778,8 @@ or assigned to a variable and referenced in a subsequent expression:: # echo 'hist:keys=next_pid:us_per_sec=1000000 ...' 
>> event/trigger # echo 'hist:keys=next_pid:timestamp_secs=common_timestamp/$us_per_sec ...' >> event/trigger -2.2.2 Synthetic Events ----------------------- +Synthetic Events +---------------- Synthetic events are user-defined events generated from hist trigger variables or fields associated with one or more other events. Their @@ -1932,8 +1932,8 @@ the ".buckets" modifier and specify a size (in this case groups of 10). Entries: 16 Dropped: 0 -2.2.3 Hist trigger 'handlers' and 'actions' -------------------------------------------- +Hist trigger 'handlers' and 'actions' +------------------------------------- A hist trigger 'action' is a function that's executed (in most cases conditionally) whenever a histogram entry is added or updated. @@ -2364,8 +2364,8 @@ The following commonly-used handler.action pairs are available: kworker/3:2-135 [003] d..3 49.823123: sched_switch: prev_comm=kworker/3:2 prev_pid=135 prev_prio=120 prev_state=T ==> next_comm=swapper/3 next_pid=0 next_prio=120 <idle>-0 [004] ..s7 49.823798: tcp_probe: src=10.0.0.10:54326 dest=23.215.104.193:80 mark=0x0 length=32 snd_nxt=0xe3ae2ff5 snd_una=0xe3ae2ecd snd_cwnd=10 ssthresh=2147483647 snd_wnd=28960 srtt=19604 rcv_wnd=29312 -3. User space creating a trigger --------------------------------- +User space creating a trigger +----------------------------- Writing into /sys/kernel/tracing/trace_marker writes into the ftrace ring buffer. This can also act like an event, by writing into the trigger diff --git a/Documentation/trace/kprobetrace.rst b/Documentation/trace/kprobetrace.rst index 8c903e39bdf2..fd5c2ef794fa 100644 --- a/Documentation/trace/kprobetrace.rst +++ b/Documentation/trace/kprobetrace.rst @@ -42,7 +42,8 @@ Synopsis of kprobe_events MEMADDR : Address where the probe is inserted. MAXACTIVE : Maximum number of instances of the specified function that can be probed simultaneously, or 0 for the default value - as defined in Documentation/trace/kprobes.rst section 1.3.1. 
+ as defined in Documentation/trace/kprobes.rst section + "How Does a Return Probe Work?" FETCHARGS : Arguments. Each probe can have up to 128 args. %REG : Fetch register REG diff --git a/Documentation/trace/tracepoint-analysis.rst b/Documentation/trace/tracepoint-analysis.rst index 716326b9f152..715896bf5f23 100644 --- a/Documentation/trace/tracepoint-analysis.rst +++ b/Documentation/trace/tracepoint-analysis.rst @@ -3,8 +3,8 @@ Notes on Analysing Behaviour Using Events and Tracepoints ========================================================= :Author: Mel Gorman (PCL information heavily based on email from Ingo Molnar) -1. Introduction -=============== +Introduction +============ Tracepoints (see Documentation/trace/tracepoints.rst) can be used without creating custom kernel modules to register probe functions using the event @@ -20,11 +20,11 @@ This document assumes that debugfs is mounted on /sys/kernel/debug and that the appropriate tracing options have been configured into the kernel. It is assumed that the PCL tool tools/perf has been installed and is in your path. -2. Listing Available Events -=========================== +Listing Available Events +======================== -2.1 Standard Utilities ----------------------- +Standard Utilities +------------------ All possible events are visible from /sys/kernel/debug/tracing/events. Simply calling:: @@ -33,8 +33,8 @@ calling:: will give a fair indication of the number of events available. -2.2 PCL (Performance Counters for Linux) ----------------------------------------- +PCL (Performance Counters for Linux) +------------------------------------ Discovery and enumeration of all counters and events, including tracepoints, are available with the perf tool. Getting a list of available events is a @@ -49,11 +49,11 @@ simple case of:: [ .... remaining output snipped .... ] -3. 
Enabling Events -================== +Enabling Events +=============== -3.1 System-Wide Event Enabling ------------------------------- +System-Wide Event Enabling +-------------------------- See Documentation/trace/events.rst for a proper description on how events can be enabled system-wide. A short example of enabling all events related @@ -61,8 +61,8 @@ to page allocation would look something like:: $ for i in `find /sys/kernel/debug/tracing/events -name "enable" | grep mm_`; do echo 1 > $i; done -3.2 System-Wide Event Enabling with SystemTap ---------------------------------------------- +System-Wide Event Enabling with SystemTap +----------------------------------------- In SystemTap, tracepoints are accessible using the kernel.trace() function call. The following is an example that reports every 5 seconds what processes @@ -87,8 +87,8 @@ were allocating the pages. print_count() } -3.3 System-Wide Event Enabling with PCL ---------------------------------------- +System-Wide Event Enabling with PCL +----------------------------------- By specifying the -a switch and analysing sleep, the system-wide events for a duration of time can be examined. @@ -109,14 +109,14 @@ for a duration of time can be examined. Similarly, one could execute a shell and exit it as desired to get a report at that point. -3.4 Local Event Enabling ------------------------- +Local Event Enabling +-------------------- Documentation/trace/ftrace.rst describes how to enable events on a per-thread basis using set_ftrace_pid. -3.5 Local Event Enablement with PCL ------------------------------------ +Local Event Enablement with PCL +------------------------------- Events can be activated and tracked for the duration of a process on a local basis using PCL such as follows. @@ -134,15 +134,15 @@ basis using PCL such as follows. 0.973913387 seconds time elapsed -4. 
Event Filtering -================== +Event Filtering +=============== Documentation/trace/ftrace.rst covers in-depth how to filter events in ftrace. Obviously using grep and awk of trace_pipe is an option as well as any script reading trace_pipe. -5. Analysing Event Variances with PCL -===================================== +Analysing Event Variances with PCL +================================== Any workload can exhibit variances between runs and it can be important to know what the standard deviation is. By and large, this is left to the @@ -185,8 +185,8 @@ time on a system-wide basis using -a and sleep. 1.002251757 seconds time elapsed ( +- 0.005% ) -6. Higher-Level Analysis with Helper Scripts -============================================ +Higher-Level Analysis with Helper Scripts +========================================= When events are enabled the events that are triggering can be read from /sys/kernel/debug/tracing/trace_pipe in human-readable format although binary @@ -217,8 +217,8 @@ also can do more such as processes, the parent process responsible for creating all the helpers can be identified -7. Lower-Level Analysis with PCL -================================ +Lower-Level Analysis with PCL +============================= There may also be a requirement to identify what functions within a program were generating events within the kernel. To begin this sort of analysis, the -- 2.30.2