Re:Re: [PATCH v3 0/6] add mTHP support for anonymous shmem

At 2024-05-31 18:13:03, "Baolin Wang" <baolin.wang@xxxxxxxxxxxxxxxxx> wrote:
>
>
>On 2024/5/31 17:35, David Hildenbrand wrote:
>> On 30.05.24 04:04, Baolin Wang wrote:
>>> Anonymous pages have already been supported for multi-size (mTHP)
>>> allocation through commit 19eaf44954df, that can allow THP to be
>>> configured through the sysfs interface located at
>>> '/sys/kernel/mm/transparent_hugepage/hugepage-XXkb/enabled'.
>>>
>>> However, the anonymous shmem will ignore the anonymous mTHP rule
>>> configured through the sysfs interface, and can only use the PMD-mapped
>>> THP, that is not reasonable. Many implement anonymous page sharing
>>> through mmap(MAP_SHARED | MAP_ANONYMOUS), especially in database usage
>>> scenarios, therefore, users expect to apply a unified mTHP strategy for
>>> anonymous pages, also including the anonymous shared pages, in order to
>>> enjoy the benefits of mTHP. For example, lower latency than PMD-mapped
>>> THP, smaller memory bloat than PMD-mapped THP, contiguous PTEs on ARM
>>> architecture to reduce TLB misses etc.
>>>
>>> The primary strategy is similar to supporting anonymous mTHP. Introduce
>>> a new interface '/mm/transparent_hugepage/hugepage-XXkb/shmem_enabled',
>>> which can have all the same values as the top-level
>>> '/sys/kernel/mm/transparent_hugepage/shmem_enabled', with adding a new
>>> additional "inherit" option. By default all sizes will be set to "never"
>>> except PMD size, which is set to "inherit". This ensures backward
>>> compatibility with the anonymous shmem enabled of the top level,
>>> meanwhile also allows independent control of anonymous shmem enabled
>>> for each mTHP.
>>>
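To make the "inherit" semantics described above concrete, here is a minimal
Python model (not kernel code). The function names, the dictionary layout, and
the 2048kB PMD size (4K base pages) are assumptions made for illustration only;
the defaults follow the description in the cover letter.

```python
# Illustrative model only; every name here is hypothetical, not a kernel API.
PMD_SIZE_KB = 2048  # assumption: 4K base pages, so PMD size is 2M


def default_per_size_settings(sizes_kb):
    """Defaults as described: "never" for every size except PMD size,
    which defaults to "inherit"."""
    return {kb: ("inherit" if kb == PMD_SIZE_KB else "never") for kb in sizes_kb}


def effective_shmem_enabled(per_size_value, top_level_value):
    """A per-size value of "inherit" defers to the top-level shmem_enabled;
    any other per-size value is used as-is."""
    return top_level_value if per_size_value == "inherit" else per_size_value


settings = default_per_size_settings([64, 128, 256, 512, 1024, 2048])
top_level = "always"  # example value of the top-level shmem_enabled
for kb, value in settings.items():
    print(f"hugepage-{kb}kB: {value} -> {effective_shmem_enabled(value, top_level)}")
```

With these defaults only the PMD-sized entry follows the top-level setting,
which is how backward compatibility is kept.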
>>> Use the page fault latency tool to measure the performance of 1G
>>> anonymous shmem with 32 threads on my machine environment with: ARM64
>>> Architecture, 32 cores, 125G memory:
>>> base: mm-unstable
>>> user-time    sys_time    faults_per_sec_per_cpu     faults_per_sec
>>> 0.04s        3.10s         83516.416                  2669684.890
>>>
>>> mm-unstable + patchset, anon shmem mTHP disabled
>>> user-time    sys_time    faults_per_sec_per_cpu     faults_per_sec
>>> 0.02s        3.14s         82936.359                  2630746.027
>>>
>>> mm-unstable + patchset, anon shmem 64K mTHP enabled
>>> user-time    sys_time    faults_per_sec_per_cpu     faults_per_sec
>>> 0.08s        0.31s         678630.231                 17082522.495
>>>
>>> From the data above, it is observed that the patchset has a minimal
>>> impact when mTHP is not enabled (some fluctuations observed during
>>> testing). When enabling 64K mTHP, there is a significant improvement
>>> of the page fault latency.
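For a rough sense of scale, the ratios can be computed directly from the three
result rows quoted above. This is only arithmetic on the reported numbers; the
variable names are mine, and a single run per configuration says nothing about
variance.

```python
# Figures copied from the quoted results (sys time in seconds, faults/sec).
base = {"sys_s": 3.10, "faults_per_sec": 2669684.890}
mthp_disabled = {"sys_s": 3.14, "faults_per_sec": 2630746.027}
mthp_64k = {"sys_s": 0.31, "faults_per_sec": 17082522.495}

# Throughput change with the patchset applied but mTHP disabled, in percent.
disabled_delta_pct = (
    (mthp_disabled["faults_per_sec"] - base["faults_per_sec"])
    / base["faults_per_sec"] * 100.0
)

# Throughput and sys-time ratios with 64K mTHP enabled, relative to base.
throughput_gain = mthp_64k["faults_per_sec"] / base["faults_per_sec"]
sys_time_reduction = base["sys_s"] / mthp_64k["sys_s"]

print(f"patchset, mTHP disabled: {disabled_delta_pct:+.2f}% faults/sec")
print(f"64K mTHP enabled: {throughput_gain:.2f}x faults/sec, "
      f"{sys_time_reduction:.1f}x less sys time")
```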
>> 
>> Let me summarize the takeaway from the bi-weekly MM meeting as I 
>> understood it, that includes Hugh's feedback on per-block tracking vs. 
>
>Thanks David for the summarization.
>
>> mTHP:
>> 
>> (1) Per-block tracking
>> 
>> Per-block tracking is currently considered unwarranted complexity in 
>> shmem.c. We should try to get it done without that. For any test cases 
>> that fail, we should consider if they are actually valid for shmem.
>> 
>> To optimize FALLOC_FL_PUNCH_HOLE for the cases where splitting+freeing
>> is not possible at fallocate() time, detecting zeropages later and
>> retrying to split+free might be an option, without per-block tracking.
>> 
>> (2) mTHP controls
>> 
>> As a default, we should not be using large folios / mTHP for any shmem, 
>> just like we did with THP via shmem_enabled. This is what this series 
>> currently does, and is part of the whole mTHP user-space interface design.
>> 
>> Further, the mTHP controls should control all of shmem, not only 
>> "anonymous shmem".
>
>Yes, that's what I thought and in my TODO list.
>
>> 
>> Also, we should properly fallback within the configured sizes, and not 
>> jump "over" configured sizes. Unless there is a good reason.
>> 
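One way to read the fallback rule in (2), sketched in Python rather than kernel
code: try the enabled sizes from largest to smallest, never a size that is not
enabled, and end at the base page. The function name, the size-in-kB
representation, and the 4kB base page are assumptions for illustration.

```python
# Illustrative sketch only; names and the 4kB base page size are assumptions.
def fallback_sizes(enabled_kb, max_kb, base_kb=4):
    """Return the sizes to try, in order: every enabled size that fits within
    max_kb, largest first, followed by the base page size. Sizes that are not
    enabled are never tried, and no enabled size is skipped on the way down."""
    candidates = sorted((kb for kb in enabled_kb if kb <= max_kb), reverse=True)
    return candidates + [base_kb]


enabled = {64, 256, 2048}
print(fallback_sizes(enabled, max_kb=2048))  # all enabled sizes fit
print(fallback_sizes(enabled, max_kb=512))   # 2048kB does not fit here
```

The real allocation path would also have to respect alignment and the
within_size/advise checks, which this sketch leaves out.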
>> (3) khugepaged
>> 
>> khugepaged needs to handle larger folios properly as well. Until fixed, 
>> using smaller THP sizes as fallback might prohibit collapsing a 
>> PMD-sized THP later. But really, khugepaged needs to be fixed to handle 
>> that.
>> 
>> (4) force/disable
>> 
>> These settings are rather testing artifacts from the old ages. We should 
>> not add them to the per-size toggles. We might "inherit" it from the 
>> global one, though.
>
>Sorry, I missed this. So I should remove the 'force' and 'deny' options 
>for each mTHP, right?
>
I would prefer that. Once tmpfs is supported, "force"/"deny" may behave
differently from "always"/"never", and users would then have to learn the
usage of "force" and "deny" all over again.

