Re: [linus:master] [sched] af7f588d8f: will-it-scale.per_thread_ops -13.9% regression

Peter Zijlstra <peterz@xxxxxxxxxxxxx> · Mon, 15 May 2023 10:40:07 +0200

On Mon, May 15, 2023 at 03:00:44PM +0800, kernel test robot wrote:
> Hello,
> 
> kernel test robot noticed a -13.9% regression of will-it-scale.per_thread_ops on:
> 
> commit: af7f588d8f7355bc4298dd1962d7826358fc95f0 ("sched: Introduce per-memory-map concurrency ID")
> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
> 
> testcase: will-it-scale
> test machine: 224 threads 2 sockets (Sapphire Rapids) with 256G memory
> parameters:
> 
> 	test: context_switch1
> 	cpufreq_governor: performance
> 
> test-description: Will It Scale takes a testcase and runs it from 1 through to n parallel copies to see if the testcase will scale. It builds both a process and threads based test in order to see any differences between the two.
> test-url: https://github.com/antonblanchard/will-it-scale
> 
> In addition to that, the commit also has significant impact on the following tests:
> 
> +------------------+----------------------------------------------------------------------------------------------------+
> | testcase: change | will-it-scale: will-it-scale.per_thread_ops -73.8% regression                                      |
> | test machine     | 224 threads 2 sockets (Sapphire Rapids) with 256G memory                                           |
> | test parameters  | cpufreq_governor=performance                                                                       |
> |                  | mode=thread                                                                                        |
> |                  | nr_task=16                                                                                         |
> |                  | test=context_switch1                                                                               |
> +------------------+----------------------------------------------------------------------------------------------------+
> | testcase: change | will-it-scale: will-it-scale.per_thread_ops -57.9% regression                                      |
> | test machine     | 104 threads 2 sockets (Skylake) with 192G memory                                                   |
> | test parameters  | cpufreq_governor=performance                                                                       |
> |                  | mode=thread                                                                                        |
> |                  | nr_task=16                                                                                         |
> |                  | test=context_switch1                                                                               |
> +------------------+----------------------------------------------------------------------------------------------------+
> | testcase: change | will-it-scale: will-it-scale.per_thread_ops -85.0% regression                                      |
> | test machine     | 104 threads 2 sockets (Skylake) with 192G memory                                                   |
> | test parameters  | cpufreq_governor=performance                                                                       |
> |                  | mode=thread                                                                                        |
> |                  | nr_task=50%                                                                                        |
> |                  | test=context_switch1                                                                               |
> +------------------+----------------------------------------------------------------------------------------------------+
> | testcase: change | vm-scalability: vm-scalability.throughput -9.0% regression                                         |
> | test machine     | 96 threads 2 sockets Intel(R) Xeon(R) Platinum 8260L CPU @ 2.40GHz (Cascade Lake) with 128G memory |
> | test parameters  | cpufreq_governor=performance                                                                       |
> |                  | runtime=300s                                                                                       |
> |                  | size=2T                                                                                            |
> |                  | test=shm-xread-seq-mt                                                                              |
> +------------------+----------------------------------------------------------------------------------------------------+
> 
> FYI, we noticed that commit 223baf9d17f2 (sched: Fix performance
> regression introduced by mm_cid) fixed a sysbench regression, but
> will-it-scale context_switch1 benchmark still saw a regression on this
> fix commit.
> 
> Furthermore, we applied the code diff in below link [1] on mainline, and
> the will-it-scale score was restored to the original level before this
> patch.
> 
> [1] https://lore.kernel.org/lkml/d96164a6-c522-1bfc-8b37-333726cdc573@xxxxxxxxxxxx/
> 

Right; so I'm thinking we can do that patch -- I'll try and get the
whole lazy TLB thing sorted, but I'm not sure I can find the peace and
quiet to think that over in a hurry :/