Patch "tools/perf: Update call stack check in builtin-lock.c" has been added to the 6.6-stable tree

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



This is a note to let you know that I've just added the patch titled

    tools/perf: Update call stack check in builtin-lock.c

to the 6.6-stable tree which can be found at:
    http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary

The filename of the patch is:
     tools-perf-update-call-stack-check-in-builtin-lock.c.patch
and it can be found in the queue-6.6 subdirectory.

If you, or anyone else, feels it should not be added to the stable tree,
please let <stable@xxxxxxxxxxxxxxx> know about it.



commit c92192982ed51f8aa9d3cd88c7de5a1471728923
Author: Kajol Jain <kjain@xxxxxxxxxxxxx>
Date:   Tue Oct 3 14:51:13 2023 +0530

    tools/perf: Update call stack check in builtin-lock.c
    
    [ Upstream commit d7c9ae8d5d1be0c4156d1e20e4369a77b711a4cc ]
    
    The perf test named "kernel lock contention analysis test"
    fails in powerpc system with below error:
    
      [command]# ./perf test 81 -vv
       81: kernel lock contention analysis test                            :
       --- start ---
      test child forked, pid 2140
      Testing perf lock record and perf lock contention
      Testing perf lock contention --use-bpf
      [Skip] No BPF support
      Testing perf lock record and perf lock contention at the same time
      Testing perf lock contention --threads
      Testing perf lock contention --lock-addr
      Testing perf lock contention --type-filter (w/ spinlock)
      Testing perf lock contention --lock-filter (w/ tasklist_lock)
      Testing perf lock contention --callstack-filter (w/ unix_stream)
      [Fail] Recorded result should have a lock from unix_stream:
      test child finished with -1
       ---- end ----
      kernel lock contention analysis test: FAILED!
    
    The test is failing because we get an address entry with 0 in
    perf lock samples for powerpc, and code for lock contention
    option "--callstack-filter" will not check further entries after
    address 0.
    
    Below are some of the samples from test generated perf.data file, which
    have 0 address in the 2nd entry of callstack:
     --------
    sched-messaging    3409 [001]  7152.904029: lock:contention_begin: 0xc00000c80904ef00 (flags=SPIN)
            c0000000001e926c __traceiter_contention_begin+0x6c ([kernel.kallsyms])
                           0 [unknown] ([unknown])
            c000000000f8a178 native_queued_spin_lock_slowpath+0x1f8 ([kernel.kallsyms])
            c000000000f89f44 _raw_spin_lock_irqsave+0x84 ([kernel.kallsyms])
            c0000000001d9fd0 prepare_to_wait+0x50 ([kernel.kallsyms])
            c000000000c80f50 sock_alloc_send_pskb+0x1b0 ([kernel.kallsyms])
            c000000000e82298 unix_stream_sendmsg+0x2b8 ([kernel.kallsyms])
            c000000000c78980 sock_sendmsg+0x80 ([kernel.kallsyms])
    
    sched-messaging    3408 [005]  7152.904036: lock:contention_begin: 0xc00000c80904ef00 (flags=SPIN)
            c0000000001e926c __traceiter_contention_begin+0x6c ([kernel.kallsyms])
                           0 [unknown] ([unknown])
            c000000000f8a178 native_queued_spin_lock_slowpath+0x1f8 ([kernel.kallsyms])
            c000000000f89f44 _raw_spin_lock_irqsave+0x84 ([kernel.kallsyms])
            c0000000001d9fd0 prepare_to_wait+0x50 ([kernel.kallsyms])
            c000000000c80f50 sock_alloc_send_pskb+0x1b0 ([kernel.kallsyms])
            c000000000e82298 unix_stream_sendmsg+0x2b8 ([kernel.kallsyms])
            c000000000c78980 sock_sendmsg+0x80 ([kernel.kallsyms])
     --------
    
    Based on commit 20002ded4d93 ("perf_counter: powerpc: Add callchain support"),
    incase of powerpc, the callchain saved by kernel always includes first
    three entries as the NIP (next instruction pointer), LR (link register), and
    the contents of LR save area in the second stack frame. In certain scenarios
    its possible to have invalid kernel instruction addresses in either of LR or the
    second stack frame's LR. In that case, kernel will store the address as zer0.
    Hence, its possible to have 2nd or 3rd callstack entry as 0.
    
    As per the current code in match_callstack_filter function, we skip the callstack
    check incase we get 0 address. And hence the test case is failing in powerpc.
    
    Fix this issue by updating the check in match_callstack_filter function,
    to not skip callstack check if the 2nd or 3rd entry have 0 address
    for powerpc.
    
    Result in powerpc after patch changes:
    
      [command]# ./perf test 81 -vv
       81: kernel lock contention analysis test                            :
       --- start ---
      test child forked, pid 4570
      Testing perf lock record and perf lock contention
      Testing perf lock contention --use-bpf
      [Skip] No BPF support
      Testing perf lock record and perf lock contention at the same time
      Testing perf lock contention --threads
      Testing perf lock contention --lock-addr
      Testing perf lock contention --type-filter (w/ spinlock)
      Testing perf lock contention --lock-filter (w/ tasklist_lock)
      [Skip] Could not find 'tasklist_lock'
      Testing perf lock contention --callstack-filter (w/ unix_stream)
      Testing perf lock contention --callstack-filter with task aggregation
      Testing perf lock contention CSV output
      [Skip] No BPF support
      test child finished with 0
       ---- end ----
      kernel lock contention analysis test: Ok
    
    Fixes: ebab291641be ("perf lock contention: Support filters for different aggregation")
    Reported-by: Disha Goel <disgoel@xxxxxxxxxxxxxxxxxx>
    Tested-by: Disha Goel <disgoel@xxxxxxxxxxxxx>
    Signed-off-by: Kajol Jain <kjain@xxxxxxxxxxxxx>
    Cc: maddy@xxxxxxxxxxxxx
    Cc: atrajeev@xxxxxxxxxxxxxxxxxx
    Link: https://lore.kernel.org/r/20231003092113.252380-1-kjain@xxxxxxxxxxxxx
    Signed-off-by: Namhyung Kim <namhyung@xxxxxxxxxx>
    Signed-off-by: Sasha Levin <sashal@xxxxxxxxxx>

diff --git a/tools/perf/builtin-lock.c b/tools/perf/builtin-lock.c
index b141f21342740..0b4b4445c5207 100644
--- a/tools/perf/builtin-lock.c
+++ b/tools/perf/builtin-lock.c
@@ -524,6 +524,7 @@ bool match_callstack_filter(struct machine *machine, u64 *callstack)
 	struct map *kmap;
 	struct symbol *sym;
 	u64 ip;
+	const char *arch = perf_env__arch(machine->env);
 
 	if (list_empty(&callstack_filters))
 		return true;
@@ -531,7 +532,21 @@ bool match_callstack_filter(struct machine *machine, u64 *callstack)
 	for (int i = 0; i < max_stack_depth; i++) {
 		struct callstack_filter *filter;
 
-		if (!callstack || !callstack[i])
+		/*
+		 * In powerpc, the callchain saved by kernel always includes
+		 * first three entries as the NIP (next instruction pointer),
+		 * LR (link register), and the contents of LR save area in the
+		 * second stack frame. In certain scenarios its possible to have
+		 * invalid kernel instruction addresses in either LR or the second
+		 * stack frame's LR. In that case, kernel will store that address as
+		 * zero.
+		 *
+		 * The below check will continue to look into callstack,
+		 * incase first or second callstack index entry has 0
+		 * address for powerpc.
+		 */
+		if (!callstack || (!callstack[i] && (strcmp(arch, "powerpc") ||
+						(i != 1 && i != 2))))
 			break;
 
 		ip = callstack[i];



[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux