Hi Mathias, On 11/16/21 3:32 AM, Matthias Schiffer wrote: > On Mon, 2021-11-15 at 12:23 -0500, Frank Rowand wrote: >> Hi Matthias, >> >> On 11/15/21 3:13 AM, Matthias Schiffer wrote: >>> On Sun, 2021-11-14 at 14:41 -0500, Frank Rowand wrote: >>>> On 11/8/21 3:48 AM, Matthias Schiffer wrote: It turns out I confused myself a bit... The first email provided the clue that I needed: >>>>> Allow fully disabling CPU nodes using status = "fail". Having no status >>>>> property at all is still interpreted as "okay" as usual. I managed to forget that sentence before I looked at the code in the patch. >>>>> >>>>> This allows a bootloader to change the number of available CPUs (for >>>>> example when a common DTS is used for SoC variants with different numbers >>>>> of cores) without deleting the nodes altogether, which could require >>>>> additional fixups to avoid dangling phandle references. >>>>> >>>>> References: >>>>> - https://www.lkml.org/lkml/2020/8/26/1237 >>>>> - https://www.spinics.net/lists/devicetree-spec/msg01007.html >>>>> - https://github.com/devicetree-org/dt-schema/pull/61 >>>>> >>>>> Signed-off-by: Matthias Schiffer <matthias.schiffer@xxxxxxxxxxxxxxx> >>>>> --- >>>>> drivers/of/base.c | 29 +++++++++++++++++++++++++++++ >>>>> 1 file changed, 29 insertions(+) >>>>> >>>>> diff --git a/drivers/of/base.c b/drivers/of/base.c >>>>> index 61de453b885c..4e9973627c8d 100644 >>>>> --- a/drivers/of/base.c >>>>> +++ b/drivers/of/base.c >>>>> @@ -650,6 +650,32 @@ bool of_device_is_available(const struct device_node *device) >>>>> } >>>>> EXPORT_SYMBOL(of_device_is_available); >>>>> >>>>> +/** >>>>> + * __of_device_is_disabled - check if a device has status "disabled" >>>>> + * >>>>> + * @device: Node to check status for, with locks already held >>>>> + * >>>>> + * Return: True if the status property is set to "disabled", >>>>> + * false otherwise >>>>> + * >>>>> + * Most callers should use __of_device_is_available() instead, this function >>>>> + * only exists due to the special interpretation of the "disabled" status for >>>>> + * CPU nodes. >>>>> + */ >>>>> +static bool __of_device_is_disabled(const struct device_node *device) >>>>> +{ >>>>> + const char *status; >>>>> + >>>>> + if (!device) >>>>> + return false; >>>>> + >>>>> + status = __of_get_property(device, "status", NULL); >>>>> + if (status == NULL) >>>>> + return false; >>>>> + >>>>> + return !strcmp(status, "disabled"); >>>>> +} >>>>> + >>>>> /** >>>>> * of_device_is_big_endian - check if a device has BE registers >>>>> * >>>>> @@ -817,6 +843,9 @@ struct device_node *of_get_next_cpu_node(struct device_node *prev) >>>>> of_node_put(node); >>>>> } So in the following test: >>>>> for (; next; next = next->sibling) { >>>>> + if (!__of_device_is_available(next) && >>>>> + !__of_device_is_disabled(next)) (!__of_device_is_available(next) && !__of_device_is_disabled(next)) is meant to detect a status of "fail" or "fail-sss". Since I forget the sentence about "fail" from early in the email, I had difficulty in interpreting the intent of the (!ok && !disabled) form of the test. The intent of the test would be more clear if it was checking _for_ "fail" instead of checking for _not_ the other possible status values. So you _are_ checking for the status of "fail" check (Rob's choice #1 below) and I just did not understand that was the intent of the patch. So I am fine with the patch if you change the above logic to check for "fail" or "fail-sss". It would also be good to add a comment to the of_get_next_cpu_node() header comment that "fail" nodes are excluded. Sorry for the confusion. -Frank >>>> >>>> Shouldn't that just be a check to continue if the device is disabled? >>>> >>>> If adding a check for available, then all of the callers of for_each_of_cpu_node() >>>> need to be checked. There is at least one that is suspicious - arch/arm/mach-imx/platsmp.c >>>> has a comment: >>>> >>>> * Initialise the CPU possible map early - this describes the CPUs >>>> * which may be present or become present in the system. >> >> Thanks for the links to previous discussion you provided below. I had >> forgotten the previous discussion. >> >> In [2] Rob ended up saying that there were two options he was fine with. >> Either (or both), in of_get_next_cpu_node(), >> >> (1) use status of "fail" as the check or >> >> (2) use status of "disabled" as the check, conditional on !IS_ENABLED(CONFIG_PPC) >> "this would need some spec update" >> "Need to double check MIPS and Sparc though." >> >> Neither of these two options are what this patch does. It seems to me that >> option 1 is probably the easiest and least intrusive method. > > My intuition is that a device with an unknown status value should not > be used. For most devices this is already handled by treating any value > that is not unset, "okay" or "ok" the same. For CPU nodes, this would > be the case by treating such values like "fail". > > I did a quick grep through the in-tree Device Trees, and I did find a > few unusual status properties (none of them in CPU nodes though): > > - Typo "failed" (4 cases) > - Typo "disable" (17 cases) > - "reserved" (19 cases) > > "fail" appears 2 times, the rest is "okay", "ok" or "disabled". > > I do not have a strong opinion on this though - for our concrete > usecase, checking for "fail" is fine, and we can treat unknown values > like "disabled" if you prefer that solution. Should "fail-*" status > values also be treated like "fail" then? > > > > >> >> -Frank >> >>> Previously, there were two option for the (effective) value of the >>> status of a device_node: >>> >>> - "okay", "ok" or unset >>> - anything else (which includes "disabled" and "fail") >>> >>> __of_device_is_available() checks which of these two is the case. >>> >>> With the new code, we have 3 cases for the status of CPU nodes: >>> >>> - "okay", "ok" or unset >>> - "disabled" >>> - anything else ("fail", ...) >>> >>> My patch will only change the behaviour in one case: When a CPU node >>> has a status that is not "okay", "ok", "disabled" or unset - which is >>> exactly the point of my patch. >>> >>> See also the change [1], which removed the !available check a while >>> ago, and the discussion in [2], which led us to the conclusion that >>> of_get_next_cpu_node() must not distinguish "okay" and "disabled" CPU >>> nodes and we instead need a third status to disable a CPU for real. >>> >>> [1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/drivers/of/base.c?id=c961cb3be9064d1097ccc019390f8b5739daafc6 >>> [2] https://www.lkml.org/lkml/2020/8/26/1237 >>> >>> >>>> >>>> -Frank >>>> >>>>> + continue; >>>>> if (!(of_node_name_eq(next, "cpu") || >>>>> __of_node_is_type(next, "cpu"))) >>>>> continue; >>>>> >>>> >>>> >> >> >