Re: [PATCHSET 0/7] perf lock contention: Improve performance if map is full (v1)

Arnaldo Carvalho de Melo <acme@xxxxxxxxxx> · Thu, 6 Apr 2023 21:41:28 -0300

On Thu, Apr 06, 2023 at 02:06:04PM -0700, Namhyung Kim wrote:
> Hello,
> 
> I got a report that the overhead of perf lock contention is too big in
> some cases.  It was running in task aggregation mode (-t) at the time,
> and there were lots of tasks contending with each other.
> 
> It turned out that the hash map update is the problem.  The result is saved
> in the lock_stat hash map, which is pre-allocated.  The BPF program never
> deletes data from the map; it only adds.  But if the map is full, trying to
> update the map becomes a very heavy operation, since it needs to check
> every CPU's freelist to get a new node to save the result.  But we know
> it would fail when the map is full, so there is no need to try the update.

Thanks, applied.

- Arnaldo

> I've checked it on my 64 CPU machine with this.
> 
>     $ perf bench sched messaging -g 1000
>     # Running 'sched/messaging' benchmark:
>     # 20 sender and receiver processes per group
>     # 1000 groups == 40000 processes run
> 
>          Total time: 2.825 [sec]
> 
> And I used the task mode, so that it can guarantee the map is full.
> The default number of map entries is 16K and this workload has 40K tasks.
> 
> Before:
>     $ sudo ./perf lock con -abt -E3 -- perf bench sched messaging -g 1000
>     # Running 'sched/messaging' benchmark:
>     # 20 sender and receiver processes per group
>     # 1000 groups == 40000 processes run
> 
>          Total time: 11.299 [sec]
>      contended   total wait     max wait     avg wait          pid   comm
> 
>          19284      3.51 s       3.70 ms    181.91 us      1305863   sched-messaging
>            243     84.09 ms    466.67 us    346.04 us      1336608   sched-messaging
>            177     66.35 ms     12.08 ms    374.88 us      1220416   node
> 
> After:
>     $ sudo ./perf lock con -abt -E3 -- perf bench sched messaging -g 1000
>     # Running 'sched/messaging' benchmark:
>     # 20 sender and receiver processes per group
>     # 1000 groups == 40000 processes run
> 
>          Total time: 3.044 [sec]
>      contended   total wait     max wait     avg wait          pid   comm
> 
>          18743    591.92 ms    442.96 us     31.58 us      1431454   sched-messaging
>             51    210.64 ms    207.45 ms      4.13 ms      1468724   sched-messaging
>             81     68.61 ms     65.79 ms    847.07 us      1463183   sched-messaging
> 
>     === output for debug ===
> 
>     bad: 1164137, total: 2253341
>     bad rate: 51.66 %
>     histogram of failure reasons
>            task: 0
>           stack: 0
>            time: 0
>            data: 1164137
> 
> The first few patches are small cleanups and fixes.  You can get the code
> from 'perf/lock-map-v1' branch in
> 
>   git://git.kernel.org/pub/scm/linux/kernel/git/namhyung/linux-perf.git
> 
> Thanks,
> Namhyung
> 
> Namhyung Kim (7):
>   perf lock contention: Simplify parse_lock_type()
>   perf lock contention: Use -M for --map-nr-entries
>   perf lock contention: Update default map size to 16384
>   perf lock contention: Add data failure stat
>   perf lock contention: Update total/bad stats for hidden entries
>   perf lock contention: Revise needs_callstack() condition
>   perf lock contention: Do not try to update if hash map is full
> 
>  tools/perf/Documentation/perf-lock.txt        |  4 +-
>  tools/perf/builtin-lock.c                     | 64 ++++++++-----------
>  tools/perf/util/bpf_lock_contention.c         |  7 +-
>  .../perf/util/bpf_skel/lock_contention.bpf.c  | 29 +++++++--
>  tools/perf/util/bpf_skel/lock_data.h          |  3 +
>  tools/perf/util/lock-contention.h             |  2 +
>  6 files changed, 60 insertions(+), 49 deletions(-)
> 
> 
> base-commit: e5116f46d44b72ede59a6923829f68a8b8f84e76
> -- 
> 2.40.0.577.gac1e443424-goog
> 
