On 12/22/2023 2:11 AM, Yang Shi wrote:
On Thu, Dec 21, 2023 at 5:40 AM Yin, Fengwei <fengwei.yin@xxxxxxxxx> wrote:
On 12/21/2023 8:58 AM, Yin Fengwei wrote:
But what I am not sure about is whether such a change is worthwhile, as
the regression is only clearly visible in the micro-benchmark. There is
no evidence that the other regressions in this report are related to
madvise, at least from the perf statistics. I need to check
stream/ramspeed further.
Thanks.
With a debugging patch (which filters the stack mapping out of THP
alignment), the stream regression shrinks to around 2%:
commit:
  30749e6fbb3d391a7939ac347e9612afe8c26e94
  1111d46b5cbad57486e7a3fab75888accac2f072
  89f60532d82b9ecd39303a74589f76e4758f176f -> 1111d46b5cbad with debugging patch

30749e6fbb3d391a 1111d46b5cbad57486e7a3fab75 89f60532d82b9ecd39303a74589
---------------- --------------------------- ---------------------------
          350993         -15.6%  296081 ± 2%          -1.5%  345689       stream.add_bandwidth_MBps
          349830         -16.1%  293492 ± 2%          -2.3%  341860 ± 2%  stream.add_bandwidth_MBps_harmonicMean
          333973         -20.5%  265439 ± 3%          -1.7%  328403       stream.copy_bandwidth_MBps
          332930         -21.7%  260548 ± 3%          -2.5%  324711 ± 2%  stream.copy_bandwidth_MBps_harmonicMean
          302788         -16.2%  253817 ± 2%          -1.4%  298421       stream.scale_bandwidth_MBps
          302157         -17.1%  250577 ± 2%          -2.0%  296054       stream.scale_bandwidth_MBps_harmonicMean
          339047         -12.1%  298061               -1.4%  334206       stream.triad_bandwidth_MBps
          338186         -12.4%  296218               -2.0%  331469       stream.triad_bandwidth_MBps_harmonicMean
The regression of ramspeed is still there.
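For readers who want to check the percentages in the stream table above, here is a minimal sketch of the arithmetic. It assumes each percentage is the relative change against the first (base) column. The helper name percent_change is made up for illustration and is not part of the 0Day tooling.

```python
def percent_change(base, new):
    """Relative change of `new` versus `base`, in percent."""
    return (new - base) / base * 100.0


# Values copied from the stream.add_bandwidth_MBps row of the table.
base = 350993        # 30749e6fbb3d391a
regressed = 296081   # 1111d46b5cbad574
with_debug = 345689  # 1111d46b5cbad with debugging patch

print(f"{percent_change(base, regressed):+.1f}%")   # -15.6%
print(f"{percent_change(base, with_debug):+.1f}%")  # -1.5%
```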
Thanks for the debugging patch and the test. If no one objects to
honoring MAP_STACK, I'm going to come up with a more formal patch.
Even though thp_get_unmapped_area() is not called for MAP_STACK, the
stack area may still, in theory, be allocated at a 2M-aligned address.
And it may be worse with multi-sized THP, e.g. for 1M.
Right. Filtering out MAP_STACK can't guarantee that no THP is used for
the stack. It just reduces the possibility of using THP for the stack.
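To illustrate the alignment point above, here is a rough sketch, not kernel code. It assumes 4K base pages and, purely for illustration, that a mapping's start address is equally likely to land on any page boundary, which real address-space layout does not guarantee. The names is_aligned and aligned_fraction are hypothetical.

```python
PAGE = 4 * 1024       # base page size (assumption: 4K pages)
SZ_1M = 1024 * 1024   # a possible multi-sized THP size
SZ_2M = 2 * SZ_1M     # PMD-sized THP on x86-64


def is_aligned(addr, size):
    """True if addr is a multiple of size."""
    return addr % size == 0


def aligned_fraction(size, window=SZ_2M, page=PAGE):
    """Fraction of page-aligned addresses in [0, window) aligned to size."""
    candidates = range(0, window, page)
    hits = sum(1 for addr in candidates if is_aligned(addr, size))
    return hits / len(candidates)


print(aligned_fraction(SZ_2M))  # 1 in 512 page-aligned addresses
print(aligned_fraction(SZ_1M))  # 2 in 512 page-aligned addresses
```

Under these assumptions an unconstrained mapping only occasionally lands on a 2M boundary, and a 1M alignment is hit twice as often, which is one way to read the concern about multi-sized THP.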
Do you have any instructions regarding how to run ramspeed? Anyway, I
may not have time to debug it until after the holidays.
0Day leverages phoronix-test-suite to run ramspeed, so I don't have a
direct answer here.
I suppose we could check the configuration of ramspeed in
phoronix-test-suite to understand which build options and command-line
options are used to run ramspeed:
https://openbenchmarking.org/test/pts/ramspeed
Regards
Yin, Fengwei