Re: [linux-next:master] [mm] 1111d46b5c: stress-ng.pthread.ops_per_sec -84.3% regression

On 12/22/2023 2:11 AM, Yang Shi wrote:
On Thu, Dec 21, 2023 at 5:40 AM Yin, Fengwei <fengwei.yin@xxxxxxxxx> wrote:



On 12/21/2023 8:58 AM, Yin Fengwei wrote:
But what I am not sure about is whether such a change is worthwhile,
as the regression is only clearly visible in a micro-benchmark. There is
no evidence that the other regressions in this report are related to
madvise, at least not in the perf statistics. I need to look further into
stream/ramspeed. Thanks.

With a debugging patch (which excludes stack mappings from THP alignment),
the stream regression shrinks to around 2%:

commit:
    30749e6fbb3d391a7939ac347e9612afe8c26e94
    1111d46b5cbad57486e7a3fab75888accac2f072
    89f60532d82b9ecd39303a74589f76e4758f176f  -> 1111d46b5cbad with debugging patch

30749e6fbb3d391a 1111d46b5cbad57486e7a3fab75 89f60532d82b9ecd39303a74589
---------------- --------------------------- ---------------------------
      350993           -15.6%     296081 ±  2%      -1.5%     345689        stream.add_bandwidth_MBps
      349830           -16.1%     293492 ±  2%      -2.3%     341860 ±  2%  stream.add_bandwidth_MBps_harmonicMean
      333973           -20.5%     265439 ±  3%      -1.7%     328403        stream.copy_bandwidth_MBps
      332930           -21.7%     260548 ±  3%      -2.5%     324711 ±  2%  stream.copy_bandwidth_MBps_harmonicMean
      302788           -16.2%     253817 ±  2%      -1.4%     298421        stream.scale_bandwidth_MBps
      302157           -17.1%     250577 ±  2%      -2.0%     296054        stream.scale_bandwidth_MBps_harmonicMean
      339047           -12.1%     298061            -1.4%     334206        stream.triad_bandwidth_MBps
      338186           -12.4%     296218            -2.0%     331469        stream.triad_bandwidth_MBps_harmonicMean
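
For anyone re-deriving the percentage columns above: they appear to be the
plain relative change of each commit's value against the base commit
30749e6fbb3d. A minimal sketch in C, assuming that interpretation (the
helper names pct_change and print_row are made up for illustration and are
not part of the 0Day tooling):

```c
#include <stdio.h>

/* Relative change of `value` against `base`, in percent.
 * Assumption: this is how the report's percentage columns are derived. */
double pct_change(double base, double value)
{
    return (value - base) / base * 100.0;
}

/* Print one comparison row: metric name, then the change of the
 * regressed commit and of the patched commit against the base. */
void print_row(const char *metric, double base, double regressed,
               double patched)
{
    printf("%-44s %+6.1f%% %+6.1f%%\n", metric,
           pct_change(base, regressed), pct_change(base, patched));
}
```

Calling print_row("stream.add_bandwidth_MBps", 350993, 296081, 345689)
gives changes that round to -15.6% and -1.5%, matching the first row.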


The regression of ramspeed is still there.

Thanks for the debugging patch and the test. If no one objects to
honoring MAP_STACK, I'm going to come up with a more formal patch.
Even though thp_get_unmapped_area() is not called for MAP_STACK, the
stack area may still, in theory, be allocated at a 2M-aligned address.
And it may be worse with multi-sized THP, e.g. for 1M.

Right. Filtering out MAP_STACK can't guarantee that THP is never used for
the stack. It just reduces the likelihood of THP being used for the stack.
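
To make the point above concrete, here is a small user-space sketch in C
for checking where a stack-like mapping lands. It is only an illustration
under stated assumptions: a 2M PMD-sized THP (as on x86-64), and helper
names (is_aligned, can_hold_pmd_thp, map_stack_like) that are hypothetical
and not taken from the debugging patch. It does not show what the kernel
does internally; it only tests whether a returned address happens to be
2M aligned and whether the range can contain a full 2M-aligned chunk.

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

/* Assumption: PMD-sized THP is 2M, as on x86-64 with 4K base pages. */
#define PMD_SIZE_ASSUMED (2UL * 1024 * 1024)

/* Returns 1 if addr is a multiple of align (align must be a power of two). */
int is_aligned(uintptr_t addr, size_t align)
{
    return (addr & (align - 1)) == 0;
}

/* Returns 1 if [addr, addr + len) contains at least one naturally
 * aligned 2M chunk, i.e. a PMD-sized THP could fit in the range. */
int can_hold_pmd_thp(uintptr_t addr, size_t len)
{
    uintptr_t first = (addr + PMD_SIZE_ASSUMED - 1) &
                      ~(uintptr_t)(PMD_SIZE_ASSUMED - 1);
    return first + PMD_SIZE_ASSUMED <= addr + len;
}

/* Map an anonymous private region with MAP_STACK, similar to how a
 * thread library might allocate a thread stack. Returns MAP_FAILED
 * on error; the caller is responsible for munmap(). */
void *map_stack_like(size_t len)
{
    return mmap(NULL, len, PROT_READ | PROT_WRITE,
                MAP_PRIVATE | MAP_ANONYMOUS | MAP_STACK, -1, 0);
}
```

For example, mapping 8M with map_stack_like() and passing the result to
is_aligned(addr, PMD_SIZE_ASSUMED) shows whether that particular mapping
ended up 2M aligned on the running kernel. A region of that size always
passes can_hold_pmd_thp(), whether or not its start is aligned, which is
why filtering on MAP_STACK alone only lowers the chance of THP on stacks.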


Do you have any instructions regarding how to run ramspeed? Anyway, I
may not have time to debug it until after the holidays.
0Day leverages phoronix-test-suite to run ramspeed, so I don't have a
direct answer here.

I suppose we could check the ramspeed configuration in
phoronix-test-suite to understand which build options and command
options are used to run ramspeed:
https://openbenchmarking.org/test/pts/ramspeed


Regards
Yin, Fengwei







