Patch "perf report: Fix wrong LBR block sorting" has been added to the 5.10-stable tree

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



This is a note to let you know that I've just added the patch titled

    perf report: Fix wrong LBR block sorting

to the 5.10-stable tree which can be found at:
    http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary

The filename of the patch is:
     perf-report-fix-wrong-lbr-block-sorting.patch
and it can be found in the queue-5.10 subdirectory.

If you, or anyone else, feels it should not be added to the stable tree,
please let <stable@xxxxxxxxxxxxxxx> know about it.



commit febaf7416cf6e90f181801117ccddb4b6d740c3b
Author: Jin Yao <yao.jin@xxxxxxxxxxxxxxx>
Date:   Wed Apr 7 10:44:52 2021 +0800

    perf report: Fix wrong LBR block sorting
    
    [ Upstream commit f2013278ae40b89cc27916366c407ce5261815ef ]
    
    When '--total-cycles' is specified, it supports sorting for all blocks
    by 'Sampled Cycles%'. This is useful to concentrate on the globally
    hottest blocks.
    
    'Sampled Cycles%' - block sampled cycles aggregation / total sampled cycles
    
    But in current code, it doesn't use the cycles aggregation. Part of
    'cycles' counting is possibly dropped for some overlap jumps. But for
    identifying the hot block, we always need the full cycles.
    
      # perf record -b ./triad_loop
      # perf report --total-cycles --stdio
    
    Before:
    
      #
      # Sampled Cycles%  Sampled Cycles  Avg Cycles%  Avg Cycles                                          [Program Block Range]      Shared Object
      # ...............  ..............  ...........  ..........  .............................................................  .................
      #
                  0.81%             793        4.32%         793                           [setup-vdso.h:34 -> setup-vdso.h:40]         ld-2.27.so
                  0.49%             480        0.87%         160                    [native_write_msr+0 -> native_write_msr+16]  [kernel.kallsyms]
                  0.48%             476        0.52%          95                      [native_read_msr+0 -> native_read_msr+29]  [kernel.kallsyms]
                  0.31%             303        1.65%         303                              [nmi_restore+0 -> nmi_restore+37]  [kernel.kallsyms]
                  0.26%             255        1.39%         255      [nohz_balance_exit_idle+75 -> nohz_balance_exit_idle+162]  [kernel.kallsyms]
                  0.24%             234        1.28%         234                       [end_repeat_nmi+67 -> end_repeat_nmi+83]  [kernel.kallsyms]
                  0.23%             227        1.24%         227            [__irqentry_text_end+96 -> __irqentry_text_end+126]  [kernel.kallsyms]
                  0.20%             194        1.06%         194             [native_set_debugreg+52 -> native_set_debugreg+56]  [kernel.kallsyms]
                  0.11%             106        0.14%          26                [native_sched_clock+0 -> native_sched_clock+98]  [kernel.kallsyms]
                  0.10%              97        0.53%          97            [trigger_load_balance+0 -> trigger_load_balance+67]  [kernel.kallsyms]
                  0.09%              85        0.46%          85             [get-dynamic-info.h:102 -> get-dynamic-info.h:111]         ld-2.27.so
      ...
                  0.00%           92.7K        0.02%           4                           [triad_loop.c:64 -> triad_loop.c:65]         triad_loop
    
    The hottest block '[triad_loop.c:64 -> triad_loop.c:65]' is not at
    the top of output.
    
    After:
    
      # Sampled Cycles%  Sampled Cycles  Avg Cycles%  Avg Cycles                                           [Program Block Range]      Shared Object
      # ...............  ..............  ...........  ..........  ..............................................................  .................
      #
                 94.35%           92.7K        0.02%           4                            [triad_loop.c:64 -> triad_loop.c:65]         triad_loop
                  0.81%             793        4.32%         793                            [setup-vdso.h:34 -> setup-vdso.h:40]         ld-2.27.so
                  0.49%             480        0.87%         160                     [native_write_msr+0 -> native_write_msr+16]  [kernel.kallsyms]
                  0.48%             476        0.52%          95                       [native_read_msr+0 -> native_read_msr+29]  [kernel.kallsyms]
                  0.31%             303        1.65%         303                               [nmi_restore+0 -> nmi_restore+37]  [kernel.kallsyms]
                  0.26%             255        1.39%         255       [nohz_balance_exit_idle+75 -> nohz_balance_exit_idle+162]  [kernel.kallsyms]
                  0.24%             234        1.28%         234                        [end_repeat_nmi+67 -> end_repeat_nmi+83]  [kernel.kallsyms]
                  0.23%             227        1.24%         227             [__irqentry_text_end+96 -> __irqentry_text_end+126]  [kernel.kallsyms]
                  0.20%             194        1.06%         194              [native_set_debugreg+52 -> native_set_debugreg+56]  [kernel.kallsyms]
                  0.11%             106        0.14%          26                 [native_sched_clock+0 -> native_sched_clock+98]  [kernel.kallsyms]
                  0.10%              97        0.53%          97             [trigger_load_balance+0 -> trigger_load_balance+67]  [kernel.kallsyms]
                  0.09%              85        0.46%          85              [get-dynamic-info.h:102 -> get-dynamic-info.h:111]         ld-2.27.so
                  0.08%              82        0.06%          11  [intel_pmu_drain_pebs_nhm+580 -> intel_pmu_drain_pebs_nhm+627]  [kernel.kallsyms]
                  0.08%              77        0.42%          77                  [lru_add_drain_cpu+0 -> lru_add_drain_cpu+133]  [kernel.kallsyms]
                  0.08%              74        0.10%          18                [handle_pmi_common+271 -> handle_pmi_common+310]  [kernel.kallsyms]
                  0.08%              74        0.40%          74              [get-dynamic-info.h:131 -> get-dynamic-info.h:157]         ld-2.27.so
                  0.07%              69        0.09%          17  [intel_pmu_drain_pebs_nhm+432 -> intel_pmu_drain_pebs_nhm+468]  [kernel.kallsyms]
    
    Now the hottest block is reported at the top of output.
    
    Fixes: b65a7d372b1a55db ("perf hist: Support block formats with compare/sort/display")
    Signed-off-by: Jin Yao <yao.jin@xxxxxxxxxxxxxxx>
    Reviewed-by: Andi Kleen <ak@xxxxxxxxxxxxxxx>
    Cc: Alexander Shishkin <alexander.shishkin@xxxxxxxxxxxxxxx>
    Cc: Jin Yao <yao.jin@xxxxxxxxx>
    Cc: Jiri Olsa <jolsa@xxxxxxxxxx>
    Cc: Kan Liang <kan.liang@xxxxxxxxxxxxxxx>
    Cc: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
    Link: http://lore.kernel.org/lkml/20210407024452.29988-1-yao.jin@xxxxxxxxxxxxxxx
    Signed-off-by: Arnaldo Carvalho de Melo <acme@xxxxxxxxxx>
    Signed-off-by: Sasha Levin <sashal@xxxxxxxxxx>

diff --git a/tools/perf/util/block-info.c b/tools/perf/util/block-info.c
index 423ec69bda6c..5ecd4f401f32 100644
--- a/tools/perf/util/block-info.c
+++ b/tools/perf/util/block-info.c
@@ -201,7 +201,7 @@ static int block_total_cycles_pct_entry(struct perf_hpp_fmt *fmt,
 	double ratio = 0.0;
 
 	if (block_fmt->total_cycles)
-		ratio = (double)bi->cycles / (double)block_fmt->total_cycles;
+		ratio = (double)bi->cycles_aggr / (double)block_fmt->total_cycles;
 
 	return color_pct(hpp, block_fmt->width, 100.0 * ratio);
 }
@@ -216,9 +216,9 @@ static int64_t block_total_cycles_pct_sort(struct perf_hpp_fmt *fmt,
 	double l, r;
 
 	if (block_fmt->total_cycles) {
-		l = ((double)bi_l->cycles /
+		l = ((double)bi_l->cycles_aggr /
 			(double)block_fmt->total_cycles) * 100000.0;
-		r = ((double)bi_r->cycles /
+		r = ((double)bi_r->cycles_aggr /
 			(double)block_fmt->total_cycles) * 100000.0;
 		return (int64_t)l - (int64_t)r;
 	}



[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux