Re: [linux-next:master] [mm] 1111d46b5c: stress-ng.pthread.ops_per_sec -84.3% regression

On Thu, Dec 21, 2023 at 5:40 AM Yin, Fengwei <fengwei.yin@xxxxxxxxx> wrote:
>
>
>
> On 12/21/2023 8:58 AM, Yin Fengwei wrote:
> > But what I am not sure about is whether such a change is worthwhile,
> > as the regression is only clearly seen in a micro-benchmark. There is no
> > evidence that the other regressions in this report are related to madvise,
> > at least from the perf statistics. I need to check stream/ramspeed further.
> > Thanks.
>
> With a debugging patch (which excludes stack mappings from THP alignment),
> the stream regression shrinks to around 2%:
>
> commit:
>    30749e6fbb3d391a7939ac347e9612afe8c26e94
>    1111d46b5cbad57486e7a3fab75888accac2f072
>    89f60532d82b9ecd39303a74589f76e4758f176f  -> 1111d46b5cbad with debugging patch
>
> 30749e6fbb3d391a 1111d46b5cbad57486e7a3fab75 89f60532d82b9ecd39303a74589
> ---------------- --------------------------- ---------------------------
>      350993           -15.6%     296081 ±  2%      -1.5%     345689        stream.add_bandwidth_MBps
>      349830           -16.1%     293492 ±  2%      -2.3%     341860 ±  2%  stream.add_bandwidth_MBps_harmonicMean
>      333973           -20.5%     265439 ±  3%      -1.7%     328403        stream.copy_bandwidth_MBps
>      332930           -21.7%     260548 ±  3%      -2.5%     324711 ±  2%  stream.copy_bandwidth_MBps_harmonicMean
>      302788           -16.2%     253817 ±  2%      -1.4%     298421        stream.scale_bandwidth_MBps
>      302157           -17.1%     250577 ±  2%      -2.0%     296054        stream.scale_bandwidth_MBps_harmonicMean
>      339047           -12.1%     298061            -1.4%     334206        stream.triad_bandwidth_MBps
>      338186           -12.4%     296218            -2.0%     331469        stream.triad_bandwidth_MBps_harmonicMean
>
>
> The regression of ramspeed is still there.

Thanks for the debugging patch and the test. If no one objects to
honoring MAP_STACK, I'm going to come up with a more formal patch.
Even though thp_get_unmapped_area() is not called for MAP_STACK, the
stack area may theoretically still be allocated at a 2M-aligned address.
And it may be worse with multi-sized THP, for example with 1M.
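
For anyone who wants to check from userspace whether a stack-style
mapping lands on a 2M boundary, something like the following could be
used. This is only a rough sketch: the helper names and the
PMD_SIZE_ASSUMED constant are made up for illustration, the 2M value
assumes x86-64 PMD-sized THP, and a single mmap() call says nothing
about how often alignment happens in a real pthread workload.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE          /* for MAP_ANONYMOUS and MAP_STACK on glibc */
#endif
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

/* Assumption: PMD-sized THP is 2 MiB (true on x86-64 with 4K pages). */
#define PMD_SIZE_ASSUMED (2UL << 20)

/* True if addr is a multiple of align; align must be a power of two. */
static bool is_aligned(uintptr_t addr, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (addr & (align - 1)) == 0;
}

/*
 * Map len bytes with MAP_STACK, roughly the way a thread library might
 * allocate a stack, and report whether the kernel picked a 2 MiB-aligned
 * start address.
 * Returns 1 if aligned, 0 if not aligned, -1 if mmap() failed.
 */
static int stack_mapping_is_thp_aligned(size_t len)
{
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_STACK, -1, 0);
    if (p == MAP_FAILED)
        return -1;

    int aligned = is_aligned((uintptr_t)p, PMD_SIZE_ASSUMED) ? 1 : 0;
    munmap(p, len);
    return aligned;
}
```

Calling stack_mapping_is_thp_aligned(8UL << 20) on kernels with and
without the commit should show whether the returned address changes
from unaligned to aligned; the 8M length is an arbitrary choice here.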

Do you have any instructions on how to run ramspeed? In any case, I
may not have time to debug it until after the holidays.

>
>
> Regards
> Yin, Fengwei




