On Thu, Dec 14, 2023 at 7:10 PM Russell King (Oracle) <linux@xxxxxxxxxxxxxxx> wrote:
>
> On Thu, Dec 14, 2023 at 06:47:00PM +0100, Rafael J. Wysocki wrote:
> > On Thu, Dec 14, 2023 at 6:32 PM Jonathan Cameron
> > <Jonathan.Cameron@xxxxxxxxxx> wrote:
> > >
> > > On Wed, 13 Dec 2023 12:49:16 +0000
> > > Russell King (Oracle) <rmk+kernel@xxxxxxxxxxxxxxx> wrote:
> > >
> > > > From: James Morse <james.morse@xxxxxxx>
> > > >
> > > > Today the ACPI enumeration code 'visits' all devices that are present.
> > > >
> > > > This is a problem for arm64, where CPUs are always present, but not
> > > > always enabled. When a device-check occurs because the firmware-policy
> > > > has changed and a CPU is now enabled, the following error occurs:
> > > > | acpi ACPI0007:48: Enumeration failure
> > > >
> > > > This is ultimately because acpi_dev_ready_for_enumeration() returns
> > > > true for a device that is not enabled. The ACPI Processor driver
> > > > will not register such CPUs as they are not 'decoding their resources'.
> > > >
> > > > Change acpi_dev_ready_for_enumeration() to also check the enabled bit.
> > > > ACPI allows a device to be functional instead of maintaining the
> > > > present and enabled bit. Make this behaviour an explicit check with
> > > > a reference to the spec, and then check the present and enabled bits.
> > > > This is needed to avoid enumerating present && functional devices that
> > > > are not enabled.
> > > >
> > > > Signed-off-by: James Morse <james.morse@xxxxxxx>
> > > > Tested-by: Miguel Luis <miguel.luis@xxxxxxxxxx>
> > > > Tested-by: Vishnu Pajjuri <vishnu@xxxxxxxxxxxxxxxxxxxxxx>
> > > > Tested-by: Jianyong Wu <jianyong.wu@xxxxxxx>
> > > > Signed-off-by: Russell King (Oracle) <rmk+kernel@xxxxxxxxxxxxxxx>
> > > > ---
> > > > If this change causes problems on deployed hardware, I suggest an
> > > > arch opt-in: ACPI_IGNORE_STA_ENABLED, that causes
> > > > acpi_dev_ready_for_enumeration() to only check the present bit.
> > >
> > > My gut feeling (having made ACPI 'fixes' in the past that ran into
> > > horribly broken firmware and had to be reverted) is to reduce the blast
> > > radius preemptively from the start. I'd love to live in a world where
> > > that wasn't necessary but I don't trust all the generators of ACPI tables.
> > > I'll leave it to Rafael and other ACPI experts to suggest how narrow we
> > > should make it though - arch opt in might be narrow enough.
> >
> > A chicken bit wouldn't help much IMO, especially in the cases when
> > working setups get broken.
> >
> > I would very much prefer to limit the scope of it, say to processors
> > only, in the first place.
>
> Thanks for the feedback and the idea.
>
> I guess we need something like:
>
>         if (device->status.present)
>                 return device->device_type != ACPI_BUS_TYPE_PROCESSOR ||
>                        device->status.enabled;
>         else
>                 return device->status.functional;
>
> so we only check device->status.enabled for processor-type devices?

Yes, something like this.
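
For illustration only, the status-bit logic quoted above can be modelled as a small standalone C sketch. Everything prefixed with sketch_ is a hypothetical stand-in invented for this example (it is not the kernel's struct acpi_device or its device-type enum), and any other conditions the real acpi_dev_ready_for_enumeration() evaluates are deliberately left out.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the device type; only two values are modelled. */
enum sketch_bus_type {
	SKETCH_BUS_TYPE_DEVICE,
	SKETCH_BUS_TYPE_PROCESSOR,
};

/* Hypothetical stand-in for the three _STA-derived bits discussed above. */
struct sketch_status {
	bool present;
	bool enabled;
	bool functional;
};

struct sketch_device {
	enum sketch_bus_type device_type;
	struct sketch_status status;
};

/*
 * Mirrors the proposed check: a present device is ready unless it is a
 * processor that is not enabled; a device that is not present is ready
 * only if it reports itself as functional.
 */
bool sketch_ready_for_enumeration(const struct sketch_device *device)
{
	if (device->status.present)
		return device->device_type != SKETCH_BUS_TYPE_PROCESSOR ||
		       device->status.enabled;

	return device->status.functional;
}
```

With this shape, a present-but-disabled processor is skipped, while a present-but-disabled device of any other type is still enumerated as before, which is the narrowing of scope asked for earlier in the thread.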