On 12.08.24 07:36, Barry Song wrote:
On Sat, Aug 10, 2024 at 1:15 AM David Hildenbrand <david@xxxxxxxxxx> wrote:
-You can change the sysfs boot time defaults of Transparent Hugepage
-Support by passing the parameter ``transparent_hugepage=always`` or
-``transparent_hugepage=madvise`` or ``transparent_hugepage=never``
-to the kernel command line.
+You can change the sysfs boot time default for the top-level "enabled"
+control by passing the parameter ``transparent_hugepage=always`` or
+``transparent_hugepage=madvise`` or ``transparent_hugepage=never`` to the
+kernel command line.
+
+Alternatively, each supported anonymous THP size can be controlled by
+passing ``thp_anon=<size>[KMG]:<state>``, where ``<size>`` is the THP size
+and ``<state>`` is one of ``always``, ``madvise``, ``never`` or
+``inherit``.
+
+For example, the following will set 64K THP to ``always``::
+
+ thp_anon=64K:always
+
+``thp_anon=`` may be specified multiple times to configure all THP sizes as
+required. If ``thp_anon=`` is specified at least once, any anon THP sizes
+not explicitly configured on the command line are implicitly set to
+``never``.
I suggest documenting that "thp_anon=" will not effect the value of
"transparent_hugepage=", or any configured default.
Did you see the previous conversation with Barry about whether or not to honour
configured defaults when any thp_anon= is provided [1]? Sounds like you also
think we should honour the PMD "inherit" default if not explicitly provided on
the command line? (see link for justification for the approach I'm currently
taking).
I primarily think that we should document it :)
What if someone passes "transparent_hugepage=always" and "thp_anon=..."?
I would assume that transparent_hugepage would only affect the global
toggle then?
[1]
https://lore.kernel.org/linux-mm/CAGsJ_4x8ruPspuk_FQVggJMWcXLbRuZFq44gg-Dt7Ewt3ExqTw@xxxxxxxxxxxxxx/
Wondering if a syntax like
thp_anon=16K,32K,64K:always;1048K,2048K:madvise
Are there examples of that syntax already or have you just made it up? I found
examples with the colon (:) but nothing this fancy. I guess that's not a reason
not to do it though (other than the risk of screwing up the parser in a subtle way).
I made it up -- mostly ;) I think we are quite flexible on what we can
do. As always, maybe we can keep it bit consistent with existing stuff.
For hugetlb_cma we have things like
"<node>:nn[KMGTPE|[,<node>:nn[KMGTPE]]
"memmap=" options are more ... advanced, including memory ranges. There
are a bunch more documented in kernel-parameters.txt that have more
elaborate formats.
Ranges would probably be the most valuable addition. So maybe we should
start with:
thp_anon=16K-64K:always,1048K-2048K:madvise
So to enable all THPs it would simply be
thp_anon=16K-2M:always
Interesting question what would happen if someone passes:
thp_anon=8K-2M:always
Likely we simply would apply it to any size in the range, even if
start/end is not a THP size.
But we would want to complain to the user if someone only specifies a
single one (or a range does not even select a single one) that does not
exist:
thp_anon=8K:always
How about rejecting settings with any illegal size or policy?
I found there are too many corner cases to say "yes" or "no". It seems
the code could be much cleaner if we simply reject illegal settings.
On the other hand, we should rely on smart users to provide correct
settings, shouldn’t we? While working one the code, I felt that
extracting partial correct settings from incorrect ones and supporting
them might be a bit of over-design.
No strong opinion on failing configs. Ignoring non-existing sizes might
lead to more portable cmdlines between kernel versions.
I have tested the below code by
"thp_anon=16K-64K:always;128K,512K:inherit;256K:madvise;1M-2M:never"
There are some parameters that separate individual options by ";"
already (config_acs). Most parameters use ",". No idea what's better
here, and which separator to use for sizes instead when using "," for
options. No strong opinion.
--
Cheers,
David / dhildenb