how to use perf record effectively

I have a working patchset which de-duplicates the per-callsite
pr_debug data (module, filename, function).

It loads that column data into 3 maple trees,
and simple accessor functions retrieve the data
by lookup on the pr_debug address.
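
As a rough picture of that lookup, here is a user-space analogy in Python. It is only a sketch of the idea (one stored entry covering a contiguous range of callsite addresses), not the kernel's maple tree API and not the patchset's code; the `RangeMap` class, the addresses, and the function names are all invented for illustration.

```python
import bisect


class RangeMap:
    """Map non-overlapping [first, last] address ranges to one shared value.

    Assumption: callers never store overlapping ranges; nothing checks this.
    """

    def __init__(self):
        self._starts = []   # sorted range start addresses
        self._ranges = []   # parallel list of (first, last, value)

    def store_range(self, first, last, value):
        # Insert while keeping both lists sorted by start address.
        i = bisect.bisect_left(self._starts, first)
        self._starts.insert(i, first)
        self._ranges.insert(i, (first, last, value))

    def load(self, addr):
        # Find the last range starting at or before addr, then check its end.
        i = bisect.bisect_right(self._starts, addr) - 1
        if i < 0:
            return None
        first, last, value = self._ranges[i]
        return value if addr <= last else None


# Hypothetical data: many callsite addresses share one function-name entry.
funcs = RangeMap()
funcs.store_range(0x1000, 0x103f, "hypothetical_init")
funcs.store_range(0x1040, 0x10ff, "hypothetical_probe")
print(funcs.load(0x1048))  # prints hypothetical_probe
```

Every callsite whose address falls inside a stored range resolves to the same string, which is where the de-duplication comes from; the price is a search per lookup instead of a direct field read.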

So it stores these callsites:
[    0.721980] dyndbg: 3653 prdebugs in 307 modules, 19 KiB in ddebug tables, 114 kiB ..

into these intervals:
[  104.047210] dyndbg: mt-funcs has 2174 entries
[  104.047816] dyndbg: mt-files has 539 entries
[  104.048410] dyndbg: mt-mods has 312 entries

Once these are loaded, the __dyndbg_sites section,
which holds the 3 columns separately from the __dyndbg section,
can be recycled.

ALL GOOD SO FAR.
BUT WHAT'S THE RUNTIME COST OF THIS?

perf stat -r200 cat /proc/dynamic_debug/control > /dev/null;

This should be a good test: it calls all 3 accessors for each
pr_debug in the kernel.
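
A possible complement to the `cat` run is to time the reads from inside one process, so that the fork/exec of `cat` is not part of each sample. The sketch below is an assumption about how one might do that, not something from the patchset; `time_reads` is a hypothetical helper, and it reports wall-clock time only, with none of the hardware counters perf provides.

```python
import statistics
import time


def time_reads(path, runs=200):
    """Read `path` to EOF `runs` times (runs >= 2).

    Returns (mean, stderr) of the per-read wall-clock time in seconds.
    """
    samples = []
    for _ in range(runs):
        t0 = time.perf_counter()
        with open(path, "rb") as f:
            # Drain the file in chunks; the data itself is discarded.
            while f.read(65536):
                pass
        samples.append(time.perf_counter() - t0)
    mean = statistics.fmean(samples)
    # Standard error of the mean: sample stddev / sqrt(number of samples).
    stderr = statistics.stdev(samples) / len(samples) ** 0.5
    return mean, stderr
```

Run as root on each kernel, `time_reads("/proc/dynamic_debug/control")` would give one mean and one error estimate per build to compare.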

But comparing master against this branch shows little change,
and adding --table to see the run-to-run variation
suggests that the change is smaller than the variation within a single test.

MASTER - v6.6

bash-5.2# perf stat -r 200 cat /proc/dynamic_debug/control > /dev/null

 Performance counter stats for 'cat /proc/dynamic_debug/control' (200 runs):

             10.29 msec task-clock                       #    0.713 CPUs utilized               ( +-  0.56% )
                43      context-switches                 #    4.177 K/sec                       ( +-  0.03% )
                 1      cpu-migrations                   #   97.142 /sec                        ( +-  5.80% )
                73      page-faults                      #    7.091 K/sec                       ( +-  0.10% )
           8906200      cycles                           #    0.865 GHz                         ( +-  0.17% )
            147349      stalled-cycles-frontend          #    1.65% frontend cycles idle        ( +-  0.18% )
             24971      stalled-cycles-backend           #    0.28% backend cycles idle         ( +-  8.18% )
          20589718      instructions                     #    2.31  insn per cycle
                                                  #    0.01  stalled cycles per insn     ( +-  0.02% )
           5470202      branches                         #  531.388 M/sec                       ( +-  0.01% )
                 0      branch-misses

         0.0144421 +- 0.0000647 seconds time elapsed  ( +-  0.45% )


DE_DUPLICATION branch

bash-5.2# perf stat -r200 cat /proc/dynamic_debug/control > /dev/null

 Performance counter stats for 'cat /proc/dynamic_debug/control' (200 runs):

             21.89 msec task-clock                       #    0.622 CPUs utilized               ( +-  0.69% )
                44      context-switches                 #    2.010 K/sec                       ( +-  0.12% )
                 1      cpu-migrations                   #   45.693 /sec                        ( +-  3.87% )
                73      page-faults                      #    3.336 K/sec                       ( +-  0.10% )
          52017542      cycles                           #    2.377 GHz                         ( +-  0.54% )
            177875      stalled-cycles-frontend          #    0.34% frontend cycles idle        ( +-  0.48% )
            134469      stalled-cycles-backend           #    0.26% backend cycles idle         ( +-  4.24% )
         134707837      instructions                     #    2.59  insn per cycle
                                                  #    0.00  stalled cycles per insn     ( +-  0.30% )
          39386555      branches                         #    1.800 G/sec                       ( +-  0.29% )
                 0      branch-misses

          0.035188 +- 0.000167 seconds time elapsed  ( +-  0.47% )
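
As a cross-check on the two summaries above, the quoted means can be compared directly. The Python sketch below copies the numbers from the two runs as printed and computes branch/master ratios, plus how far apart the elapsed times are relative to their "+-" values. Treating those "+-" figures as the uncertainty of each mean is an assumption here, and the helper names are invented for illustration.

```python
import math

# Means copied from the two perf stat summaries (master v6.6, then branch).
master = {
    "task_clock_msec": 10.29,
    "cycles": 8906200,
    "instructions": 20589718,
    "branches": 5470202,
}
branch = {
    "task_clock_msec": 21.89,
    "cycles": 52017542,
    "instructions": 134707837,
    "branches": 39386555,
}


def ratios(base, new):
    """Return new/base for every counter present in `base`."""
    return {name: new[name] / base[name] for name in base}


def separation(mean_a, err_a, mean_b, err_b):
    """Difference of two means in units of their combined '+-' values."""
    return abs(mean_b - mean_a) / math.hypot(err_a, err_b)


r = ratios(master, branch)
# "seconds time elapsed" lines: mean and +- value for each run.
sep = separation(0.0144421, 0.0000647, 0.035188, 0.000167)

for name, value in r.items():
    print(f"{name:16s} x{value:.2f}")
print(f"elapsed separation: {sep:.0f} combined-error units")
```

On the numbers as quoted, the branch retires more than six times the instructions of master, while the reported run-to-run spread of most counters is below one percent, so these two particular summaries differ by far more than their noise.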



I tried perf stat record, then perf diff on the results;
it showed empty comparisons for each of a handful of event types:

[jimc@frodo boots-dump]$ perf diff -v \
    v6.6-36-g73a29cb216c0/perf-rec-6.6.0-f2-00036-g73a29cb216c0* \
    v6.6-8-g067d1f1d8675/perf-rec-6.6.0-tf2-00008-g067d1f1d8675* \
    v6.6-23-g10f95252e906/perf-rec-6.6.0-tf2-00023-g10f95252e906* > foo
group desc not available
pmu capabilities not available
group desc not available
pmu capabilities not available
group desc not available
pmu capabilities not available
group desc not available
pmu capabilities not available
group desc not available
pmu capabilities not available
group desc not available
pmu capabilities not available
group desc not available
pmu capabilities not available
[jimc@frodo boots-dump]$ more foo
# Event 'task-clock'
#
# Data files:
#  [0] v6.6-36-g73a29cb216c0/perf-rec-6.6.0-f2-00036-g73a29cb216c0 (Baseline)
#  [1] v6.6-36-g73a29cb216c0/perf-rec-6.6.0-f2-00036-g73a29cb216c0-null
#  [2] v6.6-8-g067d1f1d8675/perf-rec-6.6.0-tf2-00008-g067d1f1d8675
#  [3] v6.6-8-g067d1f1d8675/perf-rec-6.6.0-tf2-00008-g067d1f1d8675-null
#  [4] v6.6-23-g10f95252e906/perf-rec-6.6.0-tf2-00023-g10f95252e906
#  [5] v6.6-23-g10f95252e906/perf-rec-6.6.0-tf2-00023-g10f95252e906-null
#
# Baseline/0  Delta Abs/1  Delta Abs/2  Delta Abs/3  Delta Abs/4  Delta Abs/5  Shared Object  Symbol
# ..........  ...........  ...........  ...........  ...........  ...........  .............  ......
#

# Event 'context-switches'
#
# Data files:
#  [0] v6.6-36-g73a29cb216c0/perf-rec-6.6.0-f2-00036-g73a29cb216c0 (Baseline)
#  [1] v6.6-36-g73a29cb216c0/perf-rec-6.6.0-f2-00036-g73a29cb216c0-null
#  [2] v6.6-8-g067d1f1d8675/perf-rec-6.6.0-tf2-00008-g067d1f1d8675
#  [3] v6.6-8-g067d1f1d8675/perf-rec-6.6.0-tf2-00008-g067d1f1d8675-null
#  [4] v6.6-23-g10f95252e906/perf-rec-6.6.0-tf2-00023-g10f95252e906
#  [5] v6.6-23-g10f95252e906/perf-rec-6.6.0-tf2-00023-g10f95252e906-null
#
# Baseline/0  Delta Abs/1  Delta Abs/2  Delta Abs/3  Delta Abs/4  Delta Abs/5  Shared Object  Symbol
# ..........  ...........  ...........  ...........  ...........  ...........  .............  ......
#

# Event 'cpu-migrations'
#
# Data files:
#  [0] v6.6-36-g73a29cb216c0/perf-rec-6.6.0-f2-00036-g73a29cb216c0 (Baseline)
#  [1] v6.6-36-g73a29cb216c0/perf-rec-6.6.0-f2-00036-g73a29cb216c0-null
#  [2] v6.6-8-g067d1f1d8675/perf-rec-6.6.0-tf2-00008-g067d1f1d8675
#  [3] v6.6-8-g067d1f1d8675/perf-rec-6.6.0-tf2-00008-g067d1f1d8675-null
#  [4] v6.6-23-g10f95252e906/perf-rec-6.6.0-tf2-00023-g10f95252e906
#  [5] v6.6-23-g10f95252e906/perf-rec-6.6.0-tf2-00023-g10f95252e906-null
#
# Baseline/0  Delta Abs/1  Delta Abs/2  Delta Abs/3  Delta Abs/4  Delta Abs/5  Shared Object  Symbol
# ..........  ...........  ...........  ...........  ...........  ...........  .............  ......
#

# Event 'page-faults'
#
# Data files:
#  [0] v6.6-36-g73a29cb216c0/perf-rec-6.6.0-f2-00036-g73a29cb216c0 (Baseline)
#  [1] v6.6-36-g73a29cb216c0/perf-rec-6.6.0-f2-00036-g73a29cb216c0-null
#  [2] v6.6-8-g067d1f1d8675/perf-rec-6.6.0-tf2-00008-g067d1f1d8675
#  [3] v6.6-8-g067d1f1d8675/perf-rec-6.6.0-tf2-00008-g067d1f1d8675-null
#  [4] v6.6-23-g10f95252e906/perf-rec-6.6.0-tf2-00023-g10f95252e906
#  [5] v6.6-23-g10f95252e906/perf-rec-6.6.0-tf2-00023-g10f95252e906-null
#
# Baseline/0  Delta Abs/1  Delta Abs/2  Delta Abs/3  Delta Abs/4  Delta Abs/5  Shared Object  Symbol
# ..........  ...........  ...........  ...........  ...........  ...........  .............  ......
#

# Event 'cycles'
#
# Data files:
#  [0] v6.6-36-g73a29cb216c0/perf-rec-6.6.0-f2-00036-g73a29cb216c0 (Baseline)
#  [1] v6.6-36-g73a29cb216c0/perf-rec-6.6.0-f2-00036-g73a29cb216c0-null
#  [2] v6.6-8-g067d1f1d8675/perf-rec-6.6.0-tf2-00008-g067d1f1d8675
#  [3] v6.6-8-g067d1f1d8675/perf-rec-6.6.0-tf2-00008-g067d1f1d8675-null
#  [4] v6.6-23-g10f95252e906/perf-rec-6.6.0-tf2-00023-g10f95252e906
#  [5] v6.6-23-g10f95252e906/perf-rec-6.6.0-tf2-00023-g10f95252e906-null
#
# Baseline/0  Delta Abs/1  Delta Abs/2  Delta Abs/3  Delta Abs/4  Delta Abs/5  Shared Object  Symbol
# ..........  ...........  ...........  ...........  ...........  ...........  .............  ......
#

# Event 'stalled-cycles-frontend'
#
# Data files:
#  [0] v6.6-36-g73a29cb216c0/perf-rec-6.6.0-f2-00036-g73a29cb216c0 (Baseline)
#  [1] v6.6-36-g73a29cb216c0/perf-rec-6.6.0-f2-00036-g73a29cb216c0-null
#  [2] v6.6-8-g067d1f1d8675/perf-rec-6.6.0-tf2-00008-g067d1f1d8675
#  [3] v6.6-8-g067d1f1d8675/perf-rec-6.6.0-tf2-00008-g067d1f1d8675-null
#  [4] v6.6-23-g10f95252e906/perf-rec-6.6.0-tf2-00023-g10f95252e906
#  [5] v6.6-23-g10f95252e906/perf-rec-6.6.0-tf2-00023-g10f95252e906-null
#
# Baseline/0  Delta Abs/1  Delta Abs/2  Delta Abs/3  Delta Abs/4  Delta Abs/5  Shared Object  Symbol
# ..........  ...........  ...........  ...........  ...........  ...........  .............  ......
#

# Event 'stalled-cycles-backend'
#
# Data files:
#  [0] v6.6-36-g73a29cb216c0/perf-rec-6.6.0-f2-00036-g73a29cb216c0 (Baseline)
#  [1] v6.6-36-g73a29cb216c0/perf-rec-6.6.0-f2-00036-g73a29cb216c0-null
#  [2] v6.6-8-g067d1f1d8675/perf-rec-6.6.0-tf2-00008-g067d1f1d8675
#  [3] v6.6-8-g067d1f1d8675/perf-rec-6.6.0-tf2-00008-g067d1f1d8675-null
#  [4] v6.6-23-g10f95252e906/perf-rec-6.6.0-tf2-00023-g10f95252e906
#  [5] v6.6-23-g10f95252e906/perf-rec-6.6.0-tf2-00023-g10f95252e906-null
#
# Baseline/0  Delta Abs/1  Delta Abs/2  Delta Abs/3  Delta Abs/4  Delta Abs/5  Shared Object  Symbol
# ..........  ...........  ...........  ...........  ...........  ...........  .............  ......
#

# Event 'instructions'
#
# Data files:
#  [0] v6.6-36-g73a29cb216c0/perf-rec-6.6.0-f2-00036-g73a29cb216c0 (Baseline)
#  [1] v6.6-36-g73a29cb216c0/perf-rec-6.6.0-f2-00036-g73a29cb216c0-null
#  [2] v6.6-8-g067d1f1d8675/perf-rec-6.6.0-tf2-00008-g067d1f1d8675
#  [3] v6.6-8-g067d1f1d8675/perf-rec-6.6.0-tf2-00008-g067d1f1d8675-null
#  [4] v6.6-23-g10f95252e906/perf-rec-6.6.0-tf2-00023-g10f95252e906
#  [5] v6.6-23-g10f95252e906/perf-rec-6.6.0-tf2-00023-g10f95252e906-null
#
# Baseline/0  Delta Abs/1  Delta Abs/2  Delta Abs/3  Delta Abs/4  Delta Abs/5  Shared Object  Symbol
# ..........  ...........  ...........  ...........  ...........  ...........  .............  ......
#

# Event 'branches'
#
# Data files:
#  [0] v6.6-36-g73a29cb216c0/perf-rec-6.6.0-f2-00036-g73a29cb216c0 (Baseline)
#  [1] v6.6-36-g73a29cb216c0/perf-rec-6.6.0-f2-00036-g73a29cb216c0-null
#  [2] v6.6-8-g067d1f1d8675/perf-rec-6.6.0-tf2-00008-g067d1f1d8675
#  [3] v6.6-8-g067d1f1d8675/perf-rec-6.6.0-tf2-00008-g067d1f1d8675-null
#  [4] v6.6-23-g10f95252e906/perf-rec-6.6.0-tf2-00023-g10f95252e906
#  [5] v6.6-23-g10f95252e906/perf-rec-6.6.0-tf2-00023-g10f95252e906-null
#
# Baseline/0  Delta Abs/1  Delta Abs/2  Delta Abs/3  Delta Abs/4  Delta Abs/5  Shared Object  Symbol
# ..........  ...........  ...........  ...........  ...........  ...........  .............  ......
#

# Event 'branch-misses'
#
# Data files:
#  [0] v6.6-36-g73a29cb216c0/perf-rec-6.6.0-f2-00036-g73a29cb216c0 (Baseline)
#  [1] v6.6-36-g73a29cb216c0/perf-rec-6.6.0-f2-00036-g73a29cb216c0-null
#  [2] v6.6-8-g067d1f1d8675/perf-rec-6.6.0-tf2-00008-g067d1f1d8675
#  [3] v6.6-8-g067d1f1d8675/perf-rec-6.6.0-tf2-00008-g067d1f1d8675-null
#  [4] v6.6-23-g10f95252e906/perf-rec-6.6.0-tf2-00023-g10f95252e906
#  [5] v6.6-23-g10f95252e906/perf-rec-6.6.0-tf2-00023-g10f95252e906-null
#
# Baseline/0  Delta Abs/1  Delta Abs/2  Delta Abs/3  Delta Abs/4  Delta Abs/5  Shared Object  Symbol
# ..........  ...........  ...........  ...........  ...........  ...........  .............  ......
#



Does anyone here have enough experience with perf to recommend
some tests to tease out the differences?

_______________________________________________
Kernelnewbies mailing list
Kernelnewbies@xxxxxxxxxxxxxxxxx
https://lists.kernelnewbies.org/mailman/listinfo/kernelnewbies
