On Mon, Aug 12, 2024 at 8:20 PM David Hildenbrand <david@xxxxxxxxxx> wrote: > > On 12.08.24 07:36, Barry Song wrote: > > On Sat, Aug 10, 2024 at 1:15 AM David Hildenbrand <david@xxxxxxxxxx> wrote: > >> > >>>>>>> -You can change the sysfs boot time defaults of Transparent Hugepage > >>>>>>> -Support by passing the parameter ``transparent_hugepage=always`` or > >>>>>>> -``transparent_hugepage=madvise`` or ``transparent_hugepage=never`` > >>>>>>> -to the kernel command line. > >>>>>>> +You can change the sysfs boot time default for the top-level "enabled" > >>>>>>> +control by passing the parameter ``transparent_hugepage=always`` or > >>>>>>> +``transparent_hugepage=madvise`` or ``transparent_hugepage=never`` to the > >>>>>>> +kernel command line. > >>>>>>> + > >>>>>>> +Alternatively, each supported anonymous THP size can be controlled by > >>>>>>> +passing ``thp_anon=<size>[KMG]:<state>``, where ``<size>`` is the THP size > >>>>>>> +and ``<state>`` is one of ``always``, ``madvise``, ``never`` or > >>>>>>> +``inherit``. > >>>>>>> + > >>>>>>> +For example, the following will set 64K THP to ``always``:: > >>>>>>> + > >>>>>>> + thp_anon=64K:always > >>>>>>> + > >>>>>>> +``thp_anon=`` may be specified multiple times to configure all THP sizes as > >>>>>>> +required. If ``thp_anon=`` is specified at least once, any anon THP sizes > >>>>>>> +not explicitly configured on the command line are implicitly set to > >>>>>>> +``never``. > >>>>>> > >>>>>> I suggest documenting that "thp_anon=" will not effect the value of > >>>>>> "transparent_hugepage=", or any configured default. > >>> > >>> Did you see the previous conversation with Barry about whether or not to honour > >>> configured defaults when any thp_anon= is provided [1]? Sounds like you also > >>> think we should honour the PMD "inherit" default if not explicitly provided on > >>> the command line? (see link for justification for the approach I'm currently > >>> taking). > >> > >> I primarily think that we should document it :) > >> > >> What if someone passes "transparent_hugepage=always" and "thp_anon=..."? > >> I would assume that transparent_hugepage would only affect the global > >> toggle then? > >> > >>> > >>> [1] > >>> https://lore.kernel.org/linux-mm/CAGsJ_4x8ruPspuk_FQVggJMWcXLbRuZFq44gg-Dt7Ewt3ExqTw@xxxxxxxxxxxxxx/ > >>> > >>>>>> > >>>>>> Wondering if a syntax like > >>>>>> > >>>>>> thp_anon=16K,32K,64K:always;1048K,2048K:madvise > >>> > >>> Are there examples of that syntax already or have you just made it up? I found > >>> examples with the colon (:) but nothing this fancy. I guess that's not a reason > >>> not to do it though (other than the risk of screwing up the parser in a subtle way). > >> > >> I made it up -- mostly ;) I think we are quite flexible on what we can > >> do. As always, maybe we can keep it bit consistent with existing stuff. > >> > >> For hugetlb_cma we have things like > >> "<node>:nn[KMGTPE|[,<node>:nn[KMGTPE]] > >> > >> "memmap=" options are more ... advanced, including memory ranges. There > >> are a bunch more documented in kernel-parameters.txt that have more > >> elaborate formats. > >> > >> Ranges would probably be the most valuable addition. So maybe we should > >> start with: > >> > >> thp_anon=16K-64K:always,1048K-2048K:madvise > >> > >> So to enable all THPs it would simply be > >> > >> thp_anon=16K-2M:always > >> > >> Interesting question what would happen if someone passes: > >> > >> thp_anon=8K-2M:always > >> > >> Likely we simply would apply it to any size in the range, even if > >> start/end is not a THP size. > >> > >> But we would want to complain to the user if someone only specifies a > >> single one (or a range does not even select a single one) that does not > >> exist: > >> > >> thp_anon=8K:always > > > > How about rejecting settings with any illegal size or policy? > > > > I found there are too many corner cases to say "yes" or "no". It seems > > the code could be much cleaner if we simply reject illegal settings. > > On the other hand, we should rely on smart users to provide correct > > settings, shouldn’t we? While working one the code, I felt that > > extracting partial correct settings from incorrect ones and supporting > > them might be a bit of over-design. > > No strong opinion on failing configs. Ignoring non-existing sizes might > lead to more portable cmdlines between kernel versions. Well, I realized that the code I posted has actually applied the correct parts because it modifies the global variable huge_anon_orders_xxx. Unless I use temporary variables and then copy the results to global ones, the code has been this issue to some extent. maybe I should remove the below "goto err", instead, ignore the incorrect strings and go to the next strings to parse. static int __init setup_thp_anon(char *str) { ... if (!strcmp(policy, "always")) { set_bits(&huge_anon_orders_always, start, end); clear_bits(&huge_anon_orders_inherit, start, end); clear_bits(&huge_anon_orders_madvise, start, end); } else if (!strcmp(policy, "madvise")) { ... } else if (!strcmp(policy, "inherit")) { ... } else if (!strcmp(policy, "never")) { ... } else { /* goto err; */ -> pr_err } } } ... err: pr_warn("thp_anon=%s: cannot parse(invalid size or policy), ignored\n", str); return 0; } > > > > > I have tested the below code by > > > > "thp_anon=16K-64K:always;128K,512K:inherit;256K:madvise;1M-2M:never" > > There are some parameters that separate individual options by ";" > already (config_acs). Most parameters use ",". No idea what's better > here, and which separator to use for sizes instead when using "," for > options. No strong opinion. But we have used "," for the purpose "128K,512K:inherit". so here we have to use ";" for "16K-64K:always;128K,512K:inherit" > > > -- > Cheers, > > David / dhildenb Thanks Barry