Hi Oliver,

On Fri, Jul 19, 2024 at 10:06 AM Yu Zhao <yuzhao@xxxxxxxxxx> wrote:
>
> On Fri, Jul 19, 2024 at 2:44 AM Oliver Sang <oliver.sang@xxxxxxxxx> wrote:
> >
> > hi, Yu Zhao,
> >
> > On Wed, Jul 17, 2024 at 09:44:33AM -0600, Yu Zhao wrote:
> > > On Wed, Jul 17, 2024 at 2:36 AM Yu Zhao <yuzhao@xxxxxxxxxx> wrote:
> > > >
> > > > Hi Janosch and Oliver,
> > > >
> > > > On Wed, Jul 17, 2024 at 1:57 AM Janosch Frank <frankja@xxxxxxxxxxxxx> wrote:
> > > > >
> > > > > On 7/9/24 07:11, kernel test robot wrote:
> > > > > > Hello,
> > > > > >
> > > > > > kernel test robot noticed a -34.3% regression of vm-scalability.throughput on:
> > > > > >
> > > > > > commit: 875fa64577da9bc8e9963ee14fef8433f20653e7 ("mm/hugetlb_vmemmap: fix race with speculative PFN walkers")
> > > > > > https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git master
> > > > > >
> > > > > > [still regression on linux-next/master 0b58e108042b0ed28a71cd7edf5175999955b233]
> > > > > >
> > > > > This has hit s390 huge page backed KVM guests as well.
> > > > > Our simple start/stop test case went from ~5 to over 50 seconds of runtime.
> > > >
> > > > Could you try the attached patch please? Thank you.
> > >
> > > Thanks, Yosry, for spotting the following typo:
> > >   flags &= VMEMMAP_SYNCHRONIZE_RCU;
> > > It's supposed to be:
> > >   flags &= ~VMEMMAP_SYNCHRONIZE_RCU;
> > >
> > > Reattaching v2 with the above typo fixed. Please let me know, Janosch & Oliver.
> >
> > since the commit is in mainline now, I directly apply your v2 patch upon
> > bd225530a4c71 ("mm/hugetlb_vmemmap: fix race with speculative PFN walkers")
> >
> > in our tests, your v2 patch not only recovers the performance regression,
>
> Thanks for verifying the fix!
>
> > it even has +13.7% performance improvement than 5a4d8944d6b1e (parent of
> > bd225530a4c71)
>
> Glad to hear!
>
> (The original patch improved and regressed the performance at the same
> time, but the regression is bigger. The fix removed the regression and
> surfaced the improvement.)

Can you please run the benchmark again with the attached patch on top of
the last fix?

I spotted something else worth optimizing last time, and with the attached
patch, I was able to measure significant improvements in 1GB hugeTLB
allocation and free time, e.g., when allocating and freeing 700 1GB
hugeTLB pages:

Before:
  # time echo 700 >/sys/kernel/mm/hugepages/hugepages-1048576kB/nr_hugepages
  real    0m13.500s
  user    0m0.000s
  sys     0m13.311s

  # time echo 0 >/sys/kernel/mm/hugepages/hugepages-1048576kB/nr_hugepages
  real    0m11.269s
  user    0m0.000s
  sys     0m11.187s

After:
  # time echo 700 >/sys/kernel/mm/hugepages/hugepages-1048576kB/nr_hugepages
  real    0m10.643s
  user    0m0.001s
  sys     0m10.487s

  # time echo 0 >/sys/kernel/mm/hugepages/hugepages-1048576kB/nr_hugepages
  real    0m1.541s
  user    0m0.000s
  sys     0m1.528s

Thanks!
Attachment:
hugetlb.patch
Description: Binary data
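
For readers skimming the thread, a minimal standalone sketch of why the
missing '~' in the typo quoted above matters: "flags &= FLAG" keeps only
that one bit and clears every other flag, while the intended
"flags &= ~FLAG" clears only that bit and preserves the rest. The bit
values and the second flag below are illustrative, not the kernel's
actual definitions:

  /* sketch only, not taken from the attached patch */
  #include <assert.h>

  #define VMEMMAP_SYNCHRONIZE_RCU  (1UL << 0)  /* flag name from the thread */
  #define VMEMMAP_OTHER_FLAG       (1UL << 1)  /* hypothetical second flag  */

  int main(void)
  {
          unsigned long flags = VMEMMAP_SYNCHRONIZE_RCU | VMEMMAP_OTHER_FLAG;

          /* Buggy form: masks away every bit except VMEMMAP_SYNCHRONIZE_RCU. */
          unsigned long buggy = flags & VMEMMAP_SYNCHRONIZE_RCU;
          assert(buggy == VMEMMAP_SYNCHRONIZE_RCU);

          /* Intended form: clears only VMEMMAP_SYNCHRONIZE_RCU. */
          flags &= ~VMEMMAP_SYNCHRONIZE_RCU;
          assert(flags == VMEMMAP_OTHER_FLAG);

          return 0;
  }

With the buggy form, any other flag that happened to be set in flags would
be silently dropped, which is what the v2 repost corrects.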