On Mon, Apr 15, 2024 at 12:52 PM Jonathan Cameron <Jonathan.Cameron@xxxxxxxxxx> wrote: > > On Fri, 12 Apr 2024 20:30:40 +0200 > "Rafael J. Wysocki" <rafael@xxxxxxxxxx> wrote: > > > On Fri, Apr 12, 2024 at 4:38 PM Jonathan Cameron > > <Jonathan.Cameron@xxxxxxxxxx> wrote: > > > > > > From: James Morse <james.morse@xxxxxxx> > > > > > > The arm64 specific arch_register_cpu() call may defer CPU registration > > > until the ACPI interpreter is available and the _STA method can > > > be evaluated. > > > > > > If this occurs, then a second attempt is made in > > > acpi_processor_get_info(). Note that the arm64 specific call has > > > not yet been added so for now this will never be successfully > > > called. > > > > > > Systems can still be booted with 'acpi=off', or not include an > > > ACPI description at all as in these cases arch_register_cpu() > > > will not have deferred registration when first called. > > > > > > This moves the CPU register logic back to a subsys_initcall(), > > > while the memory nodes will have been registered earlier. > > > Note this is where the call was prior to the cleanup series so > > > there should be no side effects of moving it back again for this > > > specific case. > > > > > > [PATCH 00/21] Initial cleanups for vCPU HP. > > > https://lore.kernel.org/all/ZVyz%2FVe5pPu8AWoA@xxxxxxxxxxxxxxxxxxxxx/ > > > > > > e.g. 5b95f94c3b9f ("x86/topology: Switch over to GENERIC_CPU_DEVICES") > > > > > > Signed-off-by: James Morse <james.morse@xxxxxxx> > > > Reviewed-by: Gavin Shan <gshan@xxxxxxxxxx> > > > Tested-by: Miguel Luis <miguel.luis@xxxxxxxxxx> > > > Tested-by: Vishnu Pajjuri <vishnu@xxxxxxxxxxxxxxxxxxxxxx> > > > Tested-by: Jianyong Wu <jianyong.wu@xxxxxxx> > > > Signed-off-by: Russell King (Oracle) <rmk+kernel@xxxxxxxxxxxxxxx> > > > Co-developed-by: Jonathan Cameron <Jonathan.Cameron@xxxxxxxxxx> > > > Signed-off-by: Joanthan Cameron <Jonathan.Cameron@xxxxxxxxxx> > > > --- > > > v5: Update commit message to make it clear this is moving the > > > init back to where it was until very recently. > > > > > > No longer change the condition in the earlier registration point > > > as that will be handled by the arm64 registration routine > > > deferring until called again here. > > > --- > > > drivers/acpi/acpi_processor.c | 12 ++++++++++++ > > > 1 file changed, 12 insertions(+) > > > > > > diff --git a/drivers/acpi/acpi_processor.c b/drivers/acpi/acpi_processor.c > > > index 93e029403d05..c78398cdd060 100644 > > > --- a/drivers/acpi/acpi_processor.c > > > +++ b/drivers/acpi/acpi_processor.c > > > @@ -317,6 +317,18 @@ static int acpi_processor_get_info(struct acpi_device *device) > > > > > > c = &per_cpu(cpu_devices, pr->id); > > > ACPI_COMPANION_SET(&c->dev, device); > > > + /* > > > + * Register CPUs that are present. get_cpu_device() is used to skip > > > + * duplicate CPU descriptions from firmware. > > > + */ > > > + if (!invalid_logical_cpuid(pr->id) && cpu_present(pr->id) && > > > + !get_cpu_device(pr->id)) { > > > + int ret = arch_register_cpu(pr->id); > > > + > > > + if (ret) > > > + return ret; > > > + } > > > + > > > /* > > > * Extra Processor objects may be enumerated on MP systems with > > > * less than the max # of CPUs. They should be ignored _iff > > > -- > > > > I am still unsure why there need to be two paths calling > > arch_register_cpu() in acpi_processor_get_info(). > > I replied further down the thread, but the key point was to maintain > the strong distinction between 'what' was done in a real hotplug > path vs one where onlining was all. We can relax that but it goes > contrary to the careful dance that was needed to get any agreement > to the ARM architecture aspects of this. This seems to go beyond technical territory. As a general rule, we tend to combine code paths that look similar instead of making them separate on purpose. Especially with a little to no explanation of the technical reason. > > > > Just below the comment partially pulled into the patch context above, > > there is this code: > > > > if (invalid_logical_cpuid(pr->id) || !cpu_present(pr->id)) { > > int ret = acpi_processor_hotadd_init(pr); > > > > if (ret) > > return ret; > > } > > > > For the sake of the argument, fold acpi_processor_hotadd_init() into > > it and drop the redundant _STA check from it: > > If we combine these, the _STA check is necessary because we will call this > path for delayed onlining of ARM64 CPUs (if the earlier registration code > call or arch_register_cpu() returned -EPROBE defer). That's the only way > we know that a given CPU is online capable but firmware is saying we can't > bring it online yet (it may be be vHP later). Ignoring the above as per the other message. > > > > if (invalid_logical_cpuid(pr->id) || !cpu_present(pr->id)) { > > if (invalid_phys_cpuid(pr->phys_id)) > > return -ENODEV; > > > > cpu_maps_update_begin(); > > cpus_write_lock(); > > > > ret = acpi_map_cpu(pr->handle, pr->phys_id, pr->acpi_id, &pr->id); > > I read that call as > acpi_map_cpu_for_physical_cpu_hotplug() > but we could make it equivalent of. > acpi_map_cpu_for_whatever_cpu_hotplug() Yes. > (I'm not proposing those names though ;) Sure. > in which case it is fine to just stub it out on ARM64. > > if (ret) { > > cpus_write_unlock(); > > cpu_maps_update_done(); > > return ret; > > } > > ret = arch_register_cpu(pr->id); > > if (ret) { > > acpi_unmap_cpu(pr->id); > > > > cpus_write_unlock(); > > cpu_maps_update_done(); > > return ret; > > } > > pr_info("CPU%d has been hot-added\n", pr->id); > > pr->flags.need_hotplug_init = 1; > This one needs more careful handling because we are calling this > for non hotplug cases on arm64 in which case we end up setting this > for initially online CPUs - thus if we offline and online them > again via sysfs /sys/bus/cpu/device/cpuX/online it goes through the > hotplug path and should not. > > So I need a way to detect if we are hotplugging the cpu or not. > Is there a standard way to do this? If you mean physical hot-add, I don't think so. What exactly do you need to do differently in the two cases? > I haven't figured out how to use flags in drivers to communicate this state. > > > > > cpus_write_unlock(); > > cpu_maps_update_done(); > > } > > > > so I'm not sure why this cannot be combined with the new code. > > > > Say acpi_map_cpu) / acpi_unmap_cpu() are turned into arch calls. > > What's the difference then? The locking, which should be fine if I'm > > not mistaken and need_hotplug_init that needs to be set if this code > > runs after the processor driver has loaded AFAICS. > > That's the bit that I'm currently finding a challenge. Is there a clean > way to detect that? If you mean the need_hotplug_init flag, I'm currently a bit struggling to convince myself that it is really needed.