On Mon, 18 Dec 2023 21:17:34 +0100
"Rafael J. Wysocki" <rafael@xxxxxxxxxx> wrote:

> On Wed, Dec 13, 2023 at 1:49 PM Russell King <rmk+kernel@xxxxxxxxxxxxxxx> wrote:
> >
> > From: James Morse <james.morse@xxxxxxx>

Done some digging + machine faking.  These are mid-stage results at best.

Summary: I don't think this patch is necessary.  If anyone happens to be in
the mood for testing on various platforms, can you drop this patch and see
if everything still works?

With this patch in place, and a processor container containing Processor()
objects, acpi_processor_add() is called twice: once via the path added here
and once via acpi_bus_attach() etc.

Maybe it's a leftover from earlier approaches to some of this?

> >
> > ACPI has two ways of describing processors in the DSDT. From ACPI v6.5,
> > 5.2.12:
> >
> > "Starting with ACPI Specification 6.3, the use of the Processor() object
> > was deprecated. Only legacy systems should continue with this usage. On
> > the Itanium architecture only, a _UID is provided for the Processor()
> > that is a string object. This usage of _UID is also deprecated since it
> > can preclude an OSPM from being able to match a processor to a
> > non-enumerable device, such as those defined in the MADT. From ACPI
> > Specification 6.3 onward, all processor objects for all architectures
> > except Itanium must now use Device() objects with an _HID of ACPI0007,
> > and use only integer _UID values."

Well, we definitely don't care about Itanium any more, so most of this is
irrelevant and can be scrubbed going forwards!

Otherwise I think we only care about Device() and Processor() being two
things that might be seen to describe CPUs, and they may or may not be in
a Processor container.

> >
> > Also see https://uefi.org/specs/ACPI/6.5/08_Processor_Configuration_and_Control.html#declaring-processors
> >
> > Duplicate descriptions are not allowed, the ACPI processor driver already
> > parses the UID from both devices and containers.
> > acpi_processor_get_info() returns an error if the UID exists twice in the
> > DSDT.
>
> I'm not really sure how the above is related to the actual patch.

This is nasty.  The key is that with this patch in place, we are actually
adding them twice if they are instantiated via Processor() in a processor
container.  So this reference is explaining why we don't get two lots
registered.  This patch should call out explicitly why we want to do it
twice (I'm assuming on a temporary basis).

>
> > The missing probe for CPUs described as packages
>
> It is unclear what exactly is meant by "CPUs described as packages".
>
> From the patch, it looks like those would be Processor() objects
> defined under a processor container device.

Agreed.

>
> > creates a problem for
> > moving the cpu_register() calls into the acpi_processor driver, as CPUs
> > described like this don't get registered, leading to errors from other
> > subsystems when they try to add new sysfs entries to the CPU node.
> > (e.g. topology_sysfs_init()'s use of topology_add_dev() via cpuhp)
> >
> > To fix this, parse the processor container and call acpi_processor_add()
> > for each processor that is discovered like this.
>
> Discovered like what?

Doesn't add any info.
"To fix this, parse the processor container and call acpi_processor_add()
for each processor found."

>
> > The processor container
> > handler is added with acpi_scan_add_handler(), so no detach call will
> > arrive.
>
> The above requires clarification too.
>
> > Qemu TCG describes CPUs using processor devices in a processor container.

Hmm. This isn't so clear cut.
For ARM it does it nicely with ACPI0007 etc.  For x86 it is still
Processor() under some circumstances... (why exactly doesn't matter here -
it's all legacy mess).

To poke this I hacked the arm virt qemu platform to use Processor() in a
container so I could do like-for-like comparisons.
The logic that injects a HID into Processor() objects means the existing
handlers get fired without this patch.  I'm going to assume that might not
be the case later in this patch set, but I've not found where it is broken
yet :(

> > For more information, see build_cpus_aml() in Qemu hw/acpi/cpu.c and
> > https://uefi.org/specs/ACPI/6.5/08_Processor_Configuration_and_Control.html#processor-container-device
> >
> > Signed-off-by: James Morse <james.morse@xxxxxxx>
> > Tested-by: Miguel Luis <miguel.luis@xxxxxxxxxx>
> > Tested-by: Vishnu Pajjuri <vishnu@xxxxxxxxxxxxxxxxxxxxxx>
> > Tested-by: Jianyong Wu <jianyong.wu@xxxxxxx>
> > ---
> > Outstanding comments:
> >  https://lore.kernel.org/r/20230914145353.000072e2@xxxxxxxxxx
> >  https://lore.kernel.org/r/50571c2f-aa3c-baeb-3add-cd59e0eddc02@xxxxxxxxxx
> > ---
> >  drivers/acpi/acpi_processor.c | 22 ++++++++++++++++++++++
> >  1 file changed, 22 insertions(+)
> >
> > diff --git a/drivers/acpi/acpi_processor.c b/drivers/acpi/acpi_processor.c
> > index 4fe2ef54088c..6a542e0ce396 100644
> > --- a/drivers/acpi/acpi_processor.c
> > +++ b/drivers/acpi/acpi_processor.c
> > @@ -626,9 +626,31 @@ static struct acpi_scan_handler processor_handler = {
> >  	},
> >  };
> >
> > +static acpi_status acpi_processor_container_walk(acpi_handle handle,
> > +						 u32 lvl,
> > +						 void *context,
> > +						 void **rv)
> > +{
> > +	struct acpi_device *adev;
> > +	acpi_status status;
> > +
> > +	adev = acpi_get_acpi_dev(handle);
> > +	if (!adev)
> > +		return AE_ERROR;
>
> Why is the reference counting needed here?
>
> Wouldn't acpi_fetch_acpi_dev() suffice?

You are the expert here :) I can't see why the reference is needed
so would be fine with dropping it.

>
> Also, should the walk really be terminated on the first error?

If this patch makes sense things will probably blow up later but no
worse than before so sure, keep going.
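To make the "keep going" point concrete, here is a small userspace toy model
(not kernel code, and untested against the real series).  It assumes, as I
read acpi_walk_namespace(), that a callback returning an error aborts the
rest of the walk.  All toy_* names are made up for illustration;
toy_processor_add() only stands in for acpi_processor_add().

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for acpi_status values; the numbers are arbitrary. */
typedef int toy_status;
#define TOY_OK    0
#define TOY_ERROR 1

/* Hypothetical model of one Processor() object under a container. */
struct toy_cpu {
	int uid;
	int broken;	/* simulate the add step failing for this CPU */
	int registered;
};

typedef toy_status (*toy_walk_cb)(struct toy_cpu *cpu, void *context);

/* Assumed walk semantics: any non-OK return from the callback ends the walk. */
static toy_status toy_walk(struct toy_cpu *cpus, size_t n,
			   toy_walk_cb cb, void *context)
{
	size_t i;

	assert(cb != NULL);
	for (i = 0; i < n; i++) {
		toy_status status = cb(&cpus[i], context);

		if (status != TOY_OK)
			return status;
	}
	return TOY_OK;
}

/* Stand-in for acpi_processor_add(): fails for CPUs marked broken. */
static toy_status toy_processor_add(struct toy_cpu *cpu)
{
	if (cpu->broken)
		return TOY_ERROR;
	cpu->registered = 1;
	return TOY_OK;
}

/* As posted: propagate the failure, which stops the walk. */
static toy_status toy_cb_stop_on_error(struct toy_cpu *cpu, void *context)
{
	(void)context;
	return toy_processor_add(cpu);
}

/* As suggested in review: note the failure and carry on. */
static toy_status toy_cb_keep_going(struct toy_cpu *cpu, void *context)
{
	int *failures = context;

	if (toy_processor_add(cpu) != TOY_OK)
		(*failures)++;
	return TOY_OK;
}

/* Walk three CPUs with the middle one broken; return how many registered. */
static size_t toy_demo(int keep_going, int *failures)
{
	struct toy_cpu cpus[] = {
		{ .uid = 0 }, { .uid = 1, .broken = 1 }, { .uid = 2 },
	};
	size_t n = sizeof(cpus) / sizeof(cpus[0]);
	size_t i, registered = 0;

	*failures = 0;
	toy_walk(cpus, n,
		 keep_going ? toy_cb_keep_going : toy_cb_stop_on_error,
		 failures);
	for (i = 0; i < n; i++)
		registered += (size_t)cpus[i].registered;
	return registered;
}
```

In this model the stop-on-error callback never reaches the CPU after the
broken one, while the keep-going variant registers both healthy CPUs and
records a single failure.  Whether that matters in practice depends on
whether the walk is needed at all.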
>
> > +
> > +	status = acpi_processor_add(adev, &processor_device_ids[0]);
> > +	acpi_put_acpi_dev(adev);
> > +
> > +	return status;
> > +}
> > +
> >  static int acpi_processor_container_attach(struct acpi_device *dev,
> >  					   const struct acpi_device_id *id)
> >  {
> > +	acpi_walk_namespace(ACPI_TYPE_PROCESSOR, dev->handle,
> > +			    ACPI_UINT32_MAX, acpi_processor_container_walk,
> > +			    NULL, NULL, NULL);
>
> This covers processor objects only, so why is this not needed for
> processor devices defined under a processor container object?

Both cases are covered by the existing handling without this.

I'm far from clear on why we need this patch.  Presumably it's the
reference in the description to it breaking for a Processor Package
containing Processor() objects that matters after a move... I'm struggling
to find that move though!

>
> It is not obvious, so it would be nice to add a comment explaining the
> difference.
>
> > +
> >  	return 1;
> >  }
> >
> > --
> >
> _______________________________________________
> linux-arm-kernel mailing list
> linux-arm-kernel@xxxxxxxxxxxxxxxxxxx
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel