Patch "perf stat: Fix forked applications enablement of counters" has been added to the 5.15-stable tree

This is a note to let you know that I've just added the patch titled

    perf stat: Fix forked applications enablement of counters

to the 5.15-stable tree which can be found at:
    http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary

The filename of the patch is:
     perf-stat-fix-forked-applications-enablement-of-coun.patch
and it can be found in the queue-5.15 subdirectory.

If you, or anyone else, feels it should not be added to the stable tree,
please let <stable@xxxxxxxxxxxxxxx> know about it.



commit 03f1926c78049032d92f72288ab2f95baa9bd335
Author: Thomas Richter <tmricht@xxxxxxxxxxxxx>
Date:   Thu Mar 17 16:53:46 2022 +0100

    perf stat: Fix forked applications enablement of counters
    
    [ Upstream commit d0a0a511493d269514fcbd852481cdca32c95350 ]
    
    I have run into the following issue:
    
     # perf stat -a -e new_pmu/INSTRUCTION_7/ --  mytest -c1 7
    
     Performance counter stats for 'system wide':
    
                     0      new_pmu/INSTRUCTION_7/
    
           0.000366428 seconds time elapsed
     #
    
    The new PMU for s390 counts the execution of certain CPU instructions.
    The root cause of the zero count is the extremely short run time of the
    mytest program: it just executes some assembly instructions and then
    exits.
    
    In the above invocation the instruction is executed exactly once (-c1
    option). The PMU is expected to report this single execution with a
    counter value of one, but in some cases it fails to do so.
    
    Debugging reveals that the child process is started *before* the
    counter events are installed and enabled.
    
    Tracing reveals that sometimes the child process starts and exits before
    the event is installed on all CPUs. The more CPUs the machine has, the
    more often this miscount happens.
    
    Fix this by starting the workload only after the events have been
    installed and enabled on the specified CPUs. Now the comment also
    matches the code.
    
    Output after:
    
     # perf stat -a -e new_pmu/INSTRUCTION_7/ --  mytest -c1 7
    
     Performance counter stats for 'system wide':
    
                     1      new_pmu/INSTRUCTION_7/
    
           0.000366428 seconds time elapsed
     #
    
    Now the correct result is reported reliably, regardless of how many
    CPUs are online.
    
    Reviewers' notes:
    
    Jiri:
    
    Right, without -a the event has enable_on_exec so the race does not
    matter, but it's a problem for system wide with fork.
    
    Namhyung:
    
    Agreed. Also we may move the enable_counters() and the clock code out of
    the if block to be shared with the else block.
    
    Fixes: acf2892270dcc428 ("perf stat: Use perf_evlist__prepare/start_workload()")
    Signed-off-by: Thomas Richter <tmricht@xxxxxxxxxxxxx>
    Acked-by: Jiri Olsa <jolsa@xxxxxxxxxx>
    Acked-by: Namhyung Kim <namhyung@xxxxxxxxxx>
    Acked-by: Sumanth Korikkar <sumanthk@xxxxxxxxxxxxx>
    Cc: Heiko Carstens <hca@xxxxxxxxxxxxx>
    Cc: Sven Schnelle <svens@xxxxxxxxxxxxx>
    Cc: Vasily Gorbik <gor@xxxxxxxxxxxxx>
    Link: https://lore.kernel.org/r/20220317155346.577384-1-tmricht@xxxxxxxxxxxxx
    Signed-off-by: Arnaldo Carvalho de Melo <acme@xxxxxxxxxx>
    Signed-off-by: Sasha Levin <sashal@xxxxxxxxxx>

diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index f0ecfda34ece..1a194edb5452 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -956,10 +956,10 @@ static int __run_perf_stat(int argc, const char **argv, int run_idx)
 	 * Enable counters and exec the command:
 	 */
 	if (forks) {
-		evlist__start_workload(evsel_list);
 		err = enable_counters();
 		if (err)
 			return -1;
+		evlist__start_workload(evsel_list);
 
 		t0 = rdclock();
 		clock_gettime(CLOCK_MONOTONIC, &ref_time);
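
The reordering in the hunk above can be illustrated outside of perf. The
following is a minimal sketch, not perf code: run_workload(),
enable_counters_stub(), demo() and the "cork" pipe are hypothetical names
chosen for illustration, and the stub only sets a flag instead of programming
a PMU. It assumes a forked child that is held back on a pipe until the parent
releases it, and the child reports through its exit status whether the stub
counters were already enabled at the moment it was released.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for the counter state; no real PMU is touched. */
static int counters_enabled;

/* Hypothetical stand-in for enable_counters(): it only sets a flag. */
static int enable_counters_stub(void)
{
	counters_enabled = 1;
	return 0;
}

/*
 * Fork a child that blocks on a pipe, then either enable the stub counters
 * before releasing it (enable_first != 0, the order after the patch) or
 * release it first (enable_first == 0, the old order).
 *
 * Returns 1 if the counters were enabled when the child was released,
 * 0 if they were not, and -1 (or 2 from the child) on error.
 */
static int run_workload(int enable_first)
{
	int cork[2];
	int status;
	char state;
	pid_t pid;

	counters_enabled = 0;
	if (pipe(cork) < 0)
		return -1;

	pid = fork();
	if (pid < 0) {
		close(cork[0]);
		close(cork[1]);
		return -1;
	}

	if (pid == 0) {
		/* Child: wait for the go byte, then "run" and exit at once. */
		char seen = 0;

		close(cork[1]);
		if (read(cork[0], &seen, 1) != 1)
			_exit(2);
		_exit(seen);
	}

	close(cork[0]);
	if (enable_first)
		enable_counters_stub();

	/* The go byte carries the counter state at release time. */
	state = (char)counters_enabled;
	if (write(cork[1], &state, 1) != 1) {
		close(cork[1]);
		waitpid(pid, &status, 0);
		return -1;
	}

	if (!enable_first)
		enable_counters_stub();
	close(cork[1]);

	if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
		return -1;
	return WEXITSTATUS(status);
}

/* Small usage example: compare the two orderings. */
static void demo(void)
{
	assert(run_workload(1) == 1);	/* enable first, then release */
	assert(run_workload(0) == 0);	/* release first: child misses it */
}
```

In this sketch the "release first" case misses the enable every time, whereas
in perf the old ordering only lost counts when the child happened to finish
before the events were installed on all CPUs. The point is only that enabling
before releasing the child removes the window altogether.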
