On Thu, Dec 21, 2023 at 5:40 AM Yin, Fengwei <fengwei.yin@xxxxxxxxx> wrote:
>
>
>
> On 12/21/2023 8:58 AM, Yin Fengwei wrote:
> > But what I am not sure was whether it's worthy to do such kind of change
> > as the regression only is seen obviously in micro-benchmark. No evidence
> > showed the other regressions in this report is related with madvise. At
> > least from the perf statistics. Need to check more on stream/ramspeed.
> > Thanks.
>
> With debugging patch (filter out the stack mapping from THP aligned),
> the result of stream can be restored to around 2%:
>
> commit:
>   30749e6fbb3d391a7939ac347e9612afe8c26e94
>   1111d46b5cbad57486e7a3fab75888accac2f072
>   89f60532d82b9ecd39303a74589f76e4758f176f  -> 1111d46b5cbad with debugging patch
>
> 30749e6fbb3d391a 1111d46b5cbad57486e7a3fab75 89f60532d82b9ecd39303a74589
> ---------------- --------------------------- ---------------------------
>     350993           -15.6%     296081 ±  2%      -1.5%     345689        stream.add_bandwidth_MBps
>     349830           -16.1%     293492 ±  2%      -2.3%     341860 ±  2%  stream.add_bandwidth_MBps_harmonicMean
>     333973           -20.5%     265439 ±  3%      -1.7%     328403        stream.copy_bandwidth_MBps
>     332930           -21.7%     260548 ±  3%      -2.5%     324711 ±  2%  stream.copy_bandwidth_MBps_harmonicMean
>     302788           -16.2%     253817 ±  2%      -1.4%     298421        stream.scale_bandwidth_MBps
>     302157           -17.1%     250577 ±  2%      -2.0%     296054        stream.scale_bandwidth_MBps_harmonicMean
>     339047           -12.1%     298061            -1.4%     334206        stream.triad_bandwidth_MBps
>     338186           -12.4%     296218            -2.0%     331469        stream.triad_bandwidth_MBps_harmonicMean
>
>
> The regression of ramspeed is still there.

Thanks for the debugging patch and the test. If no one objects to
honoring MAP_STACK, I'm going to come up with a more formal patch. Even
though thp_get_unmapped_area() is not called for MAP_STACK, the stack
area may still, in theory, be allocated at a 2M-aligned address. And it
may be worse with multi-sized THP, for 1M.

Do you have any instructions regarding how to run ramspeed? Anyway, I
may not have time to debug it until after the holidays.

>
>
> Regards
> Yin, Fengwei
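
On the 2M-alignment point above: a minimal userspace sketch (not part of
any proposed patch) for checking where a MAP_STACK mapping actually
lands. The helper names and the PMD_SIZE_ASSUMED constant are
illustrative assumptions; 2M is the PMD size on x86-64 with 4K base
pages and differs on other configurations.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE            /* for MAP_ANONYMOUS / MAP_STACK on glibc */
#endif
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/mman.h>

/* Assumption: 2M PMD size (x86-64, 4K base pages). */
#define PMD_SIZE_ASSUMED (2UL << 20)

/* Return 1 if addr is a multiple of align (align must be a power of two). */
static int is_aligned(const void *addr, unsigned long align)
{
    return ((uintptr_t)addr & (align - 1)) == 0;
}

/*
 * Map an anonymous private region with MAP_STACK, roughly the way a
 * thread library maps a thread stack. Returns NULL on failure.
 */
static void *map_stack(size_t size)
{
    void *p = mmap(NULL, size, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_STACK, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}

/* Hypothetical helper: map a stack-like region and report its alignment. */
static void report_stack_alignment(size_t size)
{
    void *p = map_stack(size);

    if (!p) {
        perror("mmap");
        return;
    }
    printf("%p: %s2M aligned\n", p,
           is_aligned(p, PMD_SIZE_ASSUMED) ? "" : "not ");
    munmap(p, size);
}
```

Calling report_stack_alignment(8UL << 20) on kernels with and without
the change would show whether stack-sized mappings end up 2M aligned.
It only reports the address chosen; it does not tell whether the range
is actually backed by a huge page.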