On Mon, Oct 28, 2024 at 7:34 PM Maíra Canal <mcanal@xxxxxxxxxx> wrote: > > Hi Barry, > > On 28/10/24 08:09, Barry Song wrote: > > On Mon, Oct 28, 2024 at 6:10 PM Maíra Canal <mcanal@xxxxxxxxxx> wrote: > >> > >> Hi Barry, > >> > >> On 27/10/24 18:54, Barry Song wrote: > >>> On Mon, Oct 28, 2024 at 6:58 AM Maíra Canal <mcanal@xxxxxxxxxx> wrote: > >>>> > >>>> Add the ``thp_shmem=`` kernel command line to allow specifying the > >>>> default policy of each supported shmem hugepage size. The kernel parameter > >>>> accepts the following format: > >>>> > >>>> thp_shmem=<size>[KMG],<size>[KMG]:<policy>;<size>[KMG]-<size>[KMG]:<policy> > >>>> > >>>> For example, > >>>> > >>>> thp_shmem=16K-64K:always;128K,512K:inherit;256K:advise;1M-2M:never;4M-8M:within_size > >>>> > >>>> By configuring the default policy of several shmem huge pages, the user > >>>> can take advantage of mTHP before it's been configured through sysfs. > >>>> > >>>> Signed-off-by: Maíra Canal <mcanal@xxxxxxxxxx> > >>>> --- > >>>> .../admin-guide/kernel-parameters.txt | 10 ++ > >>>> Documentation/admin-guide/mm/transhuge.rst | 17 +++ > >>>> mm/shmem.c | 109 +++++++++++++++++- > >>>> 3 files changed, 135 insertions(+), 1 deletion(-) > >>>> > >>> > >>> Hi Maíra, > >>> > >>>> diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt > >>>> index acabb04d0dd4..595fa096e28b 100644 > >>>> --- a/Documentation/admin-guide/kernel-parameters.txt > >>>> +++ b/Documentation/admin-guide/kernel-parameters.txt > >>>> @@ -6700,6 +6700,16 @@ > >>>> Force threading of all interrupt handlers except those > >>>> marked explicitly IRQF_NO_THREAD. > >>>> > >>>> + shmem_anon= [KNL] > >>>> + Format: <size>[KMG],<size>[KMG]:<policy>;<size>[KMG]-<size>[KMG]:<policy> > >>>> + Control the default policy of each hugepage size for the > >>>> + internal shmem mount. <policy> is one of policies available > >>>> + for the shmem mount ("always", "inherit", "never", "within_size", > >>>> + and "advise"). > >>>> + It can be used multiple times for multiple shmem THP sizes. > >>>> + See Documentation/admin-guide/mm/transhuge.rst for more > >>>> + details. > >>> > >>> I'm not sure this is the right name. How about "thp_shmem"? > >> > >> Oops, sorry about that. > >> > >>> > >>>> + > >>>> topology= [S390,EARLY] > >>>> Format: {off | on} > >>>> Specify if the kernel should make use of the cpu > >>>> diff --git a/Documentation/admin-guide/mm/transhuge.rst b/Documentation/admin-guide/mm/transhuge.rst > >>>> index 9b5b02c4d1ab..47e7fc30e22d 100644 > >>>> --- a/Documentation/admin-guide/mm/transhuge.rst > >>>> +++ b/Documentation/admin-guide/mm/transhuge.rst > >>>> @@ -332,6 +332,23 @@ allocation policy for the internal shmem mount by using the kernel parameter > >>>> seven valid policies for shmem (``always``, ``within_size``, ``advise``, > >>>> ``never``, ``deny``, and ``force``). > >>>> > >>>> +In the same manner as ``thp_anon`` controls each supported anonymous THP > >>>> +size, ``thp_shmem`` controls each supported shmem THP size. ``thp_shmem`` > >>>> +has the same format as ``thp_anon``, but also supports the policy > >>>> +``within_size``. > >>>> + > >>>> +``thp_shmem=`` may be specified multiple times to configure all THP sizes > >>>> +as required. If ``thp_shmem=`` is specified at least once, any shmem THP > >>>> +sizes not explicitly configured on the command line are implicitly set to > >>>> +``never``. > >>>> + > >>>> +``transparent_hugepage_shmem`` setting only affects the global toggle. If > >>>> +``thp_shmem`` is not specified, PMD_ORDER hugepage will default to > >>>> +``inherit``. However, if a valid ``thp_shmem`` setting is provided by the > >>>> +user, the PMD_ORDER hugepage policy will be overridden. If the policy for > >>>> +PMD_ORDER is not defined within a valid ``thp_shmem``, its policy will > >>>> +default to ``never``. > >>>> + > >>>> Hugepages in tmpfs/shmem > >>>> ======================== > >>>> > >>>> diff --git a/mm/shmem.c b/mm/shmem.c > >>>> index 24cdeafd8260..0a7a7d04f725 100644 > >>>> --- a/mm/shmem.c > >>>> +++ b/mm/shmem.c > > [...] > > >>>> static int __init setup_transparent_hugepage_shmem(char *str) > >>>> { > >>>> int huge, ret = 0; > >>>> @@ -5206,6 +5228,91 @@ static int __init setup_transparent_hugepage_shmem(char *str) > >>>> } > >>>> __setup("transparent_hugepage_shmem=", setup_transparent_hugepage_shmem); > >>>> > >>>> +static char str_dup[PAGE_SIZE] __initdata; > >>>> +static int __init setup_thp_shmem(char *str) > >>>> +{ > >>>> + char *token, *range, *policy, *subtoken; > >>>> + unsigned long always, inherit, madvise, within_size; > >>>> + char *start_size, *end_size; > >>>> + int start, end, nr; > >>>> + char *p; > >>>> + > >>>> + if (!str || strlen(str) + 1 > PAGE_SIZE) > >>>> + goto err; > >>>> + strcpy(str_dup, str); > >>>> + > >>>> + always = huge_shmem_orders_always; > >>>> + inherit = huge_shmem_orders_inherit; > >>>> + madvise = huge_shmem_orders_madvise; > >>>> + within_size = huge_shmem_orders_within_size; > >>>> + p = str_dup; > >>>> + while ((token = strsep(&p, ";")) != NULL) { > >>>> + range = strsep(&token, ":"); > >>>> + policy = token; > >>>> + > >>>> + if (!policy) > >>>> + goto err; > >>>> + > >>>> + while ((subtoken = strsep(&range, ",")) != NULL) { > >>>> + if (strchr(subtoken, '-')) { > >>>> + start_size = strsep(&subtoken, "-"); > >>>> + end_size = subtoken; > >>>> + > >>>> + start = get_order_from_str(start_size); > >>>> + end = get_order_from_str(end_size); > >>>> + } else { > >>>> + start = end = get_order_from_str(subtoken); > >>>> + } > >>>> + > >>>> + if (start < 0 || end < 0 || start > end) > >>>> + goto err; > >>>> + > >>>> + nr = end - start + 1; > >>>> + if (!strcmp(policy, "always")) { > >>>> + bitmap_set(&always, start, nr); > >>>> + bitmap_clear(&inherit, start, nr); > >>>> + bitmap_clear(&madvise, start, nr); > >>>> + bitmap_clear(&within_size, start, nr); > >>>> + } else if (!strcmp(policy, "advise")) { > >>>> + bitmap_set(&madvise, start, nr); > >>>> + bitmap_clear(&inherit, start, nr); > >>>> + bitmap_clear(&always, start, nr); > >>>> + bitmap_clear(&within_size, start, nr); > >>>> + } else if (!strcmp(policy, "inherit")) { > >>>> + bitmap_set(&inherit, start, nr); > >>>> + bitmap_clear(&madvise, start, nr); > >>>> + bitmap_clear(&always, start, nr); > >>>> + bitmap_clear(&within_size, start, nr); > >>>> + } else if (!strcmp(policy, "within_size")) { > >>>> + bitmap_set(&within_size, start, nr); > >>>> + bitmap_clear(&inherit, start, nr); > >>>> + bitmap_clear(&madvise, start, nr); > >>>> + bitmap_clear(&always, start, nr); > >>>> + } else if (!strcmp(policy, "never")) { > >>>> + bitmap_clear(&inherit, start, nr); > >>>> + bitmap_clear(&madvise, start, nr); > >>>> + bitmap_clear(&always, start, nr); > >>>> + bitmap_clear(&within_size, start, nr); > >>>> + } else { > >>>> + pr_err("invalid policy %s in thp_shmem boot parameter\n", policy); > >>>> + goto err; > >>>> + } > >>>> + } > >>>> + } > >>>> + > >>>> + huge_shmem_orders_always = always; > >>>> + huge_shmem_orders_madvise = madvise; > >>>> + huge_shmem_orders_inherit = inherit; > >>>> + huge_shmem_orders_within_size = within_size; > >>>> + shmem_orders_configured = true; > >>>> + return 1; > >>>> + > >>>> +err: > >>>> + pr_warn("thp_shmem=%s: error parsing string, ignoring setting\n", str); > >>>> + return 0; > >>>> +} > >>> > >>> Can we share source code with thp_anon since there's a lot of duplication? > >> > >> I'm not a regular mm contributor and I'm most usually around drivers, so > >> I don't know exactly here I could add shared code. Should I add the > >> headers to "internal.h"? > > > > My comment isn't related to drivers or memory management. It's solely about > > avoiding code duplication. For example, we could create a shared function to > > handle both controls, reducing redundant code :-) > > Let me rephrase it. > > I completely agree that we should avoid code duplication. I'm asking > where is the best place to add the headers of the shared functions. > "linux/shmem_fs.h" doesn't look appropriate to me, so I believe the > remaining options would be "linux/huge_mm.h" or "internal.h". Both locations seem quite odd. I have a feeling that these boot command elements are purely internal, yet internal.h contains something that is actually 'external' to mm. The shared code isn't 'external' enough to belong in internal.h. I didn't realize that shmem has placed these controls in its own file; I thought they were also located in mm/huge_memory.c. Given the current situation, I would prefer to keep the code as it is and tolerate the code duplication. Unless we are going to place controls for shmem and other thp controls in one place, I feel your code is better than having a shared function either in internal.h or linux/huge_mm.h. > > I would like to know your opinion about those two options. > > Best Regards, > - Maíra > > > > >> > >> Best Regards, > >> - Maíra > >> > >>> > >>>> +__setup("thp_shmem=", setup_thp_shmem); > >>>> + > >>>> #endif /* CONFIG_TRANSPARENT_HUGEPAGE */ > >>>> > >>>> #else /* !CONFIG_SHMEM */ > >>>> -- > >>>> 2.46.2 > >>>> > >>> Thanks barry