On Wed, Apr 17, 2024 at 5:01 PM Salil Mehta <salil.mehta@xxxxxxxxxx> wrote: > > HI Rafael, > > > From: Rafael J. Wysocki <rafael@xxxxxxxxxx> > > Sent: Monday, April 15, 2024 5:39 PM > > > > On Mon, Apr 15, 2024 at 5:31 PM Salil Mehta <salil.mehta@xxxxxxxxxx> wrote: > > > > > > > From: Rafael J. Wysocki <rafael@xxxxxxxxxx> > > > > Sent: Monday, April 15, 2024 1:51 PM > > > > > > > > On Mon, Apr 15, 2024 at 1:51 PM Salil Mehta > > > > <salil.mehta@xxxxxxxxxx> > > > > wrote: > > > > > > > > > [cut] > > > > > > > > Though virtualization happily jumped on it to hot add/remove > > > > CPUs > > to/from a guest. > > > > > > > > > > > > There are limitations to this and we learned it the hard way > > > > on > > X86. At the end we came up with the following restrictions: > > > > > > > > > > > > 1) All possible CPUs have to be advertised at boot time via firmware > > > > > > (ACPI/DT/whatever) independent of them being present at boot time > > > > > > or not. > > > > > > > > > > > > That guarantees proper sizing and ensures that associations > > > > > > between hardware entities and software representations and the > > > > > > resulting topology are stable for the lifetime of a system. > > > > > > > > > > > > It is really required to know the full topology of the system at > > > > > > boot time especially with hybrid CPUs where some of the cores > > > > > > have hyperthreading and the others do not. > > > > > > > > > > > > > > > > > > 2) Hot add can only mark an already registered (possible) CPU > > > > > > present. Adding non-registered CPUs after boot is not possible. > > > > > > > > > > > > The CPU must have been registered in #1 already to ensure that > > > > > > the system topology does not suddenly change in an incompatible > > > > > > way at run-time. > > > > > > > > > > > > The same restriction would apply to real physical hotplug. I > > > > don't > > think that's any different for ARM64 or any other architecture. > > > > > > > > > > > > > > > There is a difference: > > > > > > > > > > 1. ARM arch does not allows for any processor to be NOT present. Hence, because of > > > > > this restriction any of its related per-cpu components must be > > > > present > and enumerated at the boot time as well (exposed by > > > > firmware and > ACPI). This means all the enumerated processors will > > > > be marked as > 'present' but they might exist in NOT enabled (_STA.enabled=0) state. > > > > > > > > > > There was one clear difference and please correct me if I'm wrong > > > > > here, for x86, the LAPIC associated with the x86 core can be brought online later even after boot? > > > > > > > > > > But for ARM Arch, processors and its corresponding per-cpu > > > > components > like redistributors all need to be present and > > > > enumerated during the > boot time. Redistributors are part of ALWAYS-ON power domain. > > > > > > > > OK > > > > > > > > So what exactly is the problem with this and what does > > > > acpi_processor_add() have to do with it? > > > > > > > > > For ARM Arch, during boot time, it should add processor as if no > > > hotplug exists. But later, in context to the (fake) hotplug trigger > > > from the virtualizer as a result of the CPU (un)plug action it should just > > end up in registering the already present CPU with the Linux Driver Model. > > > > So let me repeat this last time: acpi_processor_add() cannot do that, > > because (as defined today) it rejects CPUs with the "enabled" bit clear in _STA. > > > I understand that now because you have placed a check recently. sorry for stretching > it a bit but I wanted to clearly understand the reason for this behavior. Is it because, > > 1. It does not makes sense to add a disabled but present/functional processor or > perhaps there are repercussions to support such a behavior? Yes because it is unusable. > Or > > 2. None of the existing processors need such a behavior? > > > > > > > Do you want the per-CPU structures etc. to be created from the > > > > acpi_processor_add() path? > > > > > > > > > I referred to the components related to ARM CPU Arch like redistributors etc. > > > which will get initialized in context to Arch specific _init code not > > > here. This i.e. acpi_processor_add() is arch agnostic code common to all architectures. > > > > > > [ A digression: You do have _weak functions which can be overridden to > > > arch specific handling like arch_(un)map_cpu() etc. but we can't use > > > those to defer initialize the CPU related components - ARM Arch > > > constraint!] > > > > Not right now, but they can be added I suppose. > > > > > > > > > > > > > This plain won't work because acpi_processor_add(), as defined in > > > > the mainline kernel today (and the Jonathan's patches don't change > > > > that AFAICS), returns an error for processor devices with the > > > > "enabled" bit clear in _STA (it returns an error if the "present" > > > > bit is clear too, but that's obvious), so it only gets to calling > > > > arch_register_cpu() if > > > > *both* "present" and "enabled" _STA bits are set for the given > > > > processor device. > > > > > > > > > If you are referring to the _STA check in the XX_hot_add_init() then > > > in the current kernel code it only checks for the > > > ACPI_STA_DEVICE_PRESENT flag and not the ACPI_STA_DEVICE_ENABLED flag(?). > > > > No, I am not. I'm referring to this code in 6.9-rc4: > > > > static int acpi_processor_add(struct acpi_device *device, > > const struct acpi_device_id *id) { > > struct acpi_processor *pr; > > struct device *dev; > > int result = 0; > > > > if (!acpi_device_is_enabled(device)) > > return -ENODEV; > > > Ahh, sorry, I missed this check as this has been added recently. Yes, now your > logic of having common legs makes more sense. > > > > > > ... > > } > > > > where acpi_device_is_enabled() is defined as follows: > > > > bool acpi_device_is_enabled(const struct acpi_device *adev) { > > return adev->status.present && adev->status.enabled; } > > > Got it. > > > [digression note:] > BTW, I'm wondering why we are checking adev->status.present > as having adev->status.enabled as set and adev->status.present > as unset would mean firmware has a BUG. If we really want to > check this then we should rather flag a warning on detection > of this condition? Adding a warning would be fine with me. > Either this: > bool acpi_device_is_enabled(const struct acpi_device *adev) { > > if (!acpi_device_is_present(adev)) { > if (adev->status.enabled) > pr_debug("Device [%s] status inconsistent: Enabled but not Present\n", > device->pnp.bus_id); > return false; > } > return true; > } > > Ideally this inconsistency should have been checked in acpi_bus_get_status() > and above function should have been just, Yes, it can be added there. It can even clear 'enabled' if 'present' is clear. > file: drivers/acpi/scan.c > bool acpi_device_is_enabled(const struct acpi_device *adev) { > return !!adev->status.enabled; } Sure. > file: drivers/acpi/bus.c > int acpi_bus_get_status(struct acpi_device *device) > { > [...] > > status = acpi_bus_get_status_handle(device->handle, &sta); > if (ACPI_FAILURE(status)) > return -ENODEV; > > acpi_set_device_status(device, sta); > > if (device->status.functional && !device->status.present) { > pr_debug("Device [%s] status [%08x]: functional but not present\n", > device->pnp.bus_id, (u32)sta); > } > > + if (device->status.enabled && !device->status.present) { > + pr_debug("BUG: Device [%s] status [%08x]: enabled but not present\n", > + device->pnp.bus_id, (u32)sta); > + /* any specific handling here? */ > + } > > pr_debug("Device [%s] status [%08x]\n", device->pnp.bus_id, (u32)sta); > return 0; > } > > > > > > The code being reviewed has changed > > > exactly that behavior for 2 legs i.e. make-present and make-enabled legs. > > > > I'm not sure what you mean here, but the code above means that > > acpi_processor_add) does not distinguish between CPU with the "present" > > bit clear (in which case the "enabled" bit must also be clear as per the spec) > > and CPUs with the "present" bit set and the "enabled" bit clear. These two > > cases are handled in the same way. > > > > > I'm open to further address your point clearly. > > > > I hope that the above is clear enough. > > > Yes, clear now. I missed the new check. > > > > > > > > > > > That, BTW, is why I keep saying that from the ACPI CPU enumeration > > > > code perspective, there is no difference between "present" and > > "enabled". > > > > > > > > > Agreed but there is still a subtle difference. Enumeration happens > > > once and for all the processors during the boot time. And as confirmed > > > by yourself and Thomas as well that even in x86 arch all the > > > processors will be discovered and their topology fixed during the boot > > > time which is effectively the same behavior as in the ARM Arch. But > > > ARM assumes those 'present' bits in the present masks to be set during > > > the boot time which is not like x86(?). Hence, 'present cpu' Bits > > > will always be equal to 'possible cpu' Bits. This is a constraint put by the > > ARM maintainers and looks unshakable. > > > > Yes, there are differences between architectures, but the ACPI code is (or > > at least should be) architecture-agnostic (as you said somewhere above). > > So why does this matter for the ACPI code? > > > It should not. There were few bits like overriding of arch_register_cpu() which > was not allowed by ARM folks in 2020 when I floated the first prototype. > > > > > > > 2. Agreed regarding the topology. Are you suggesting that we > > > > must > call arch_register_cpu() during boot time for all the 'present' CPUs? > > > > > Even if that's the case, we might still want to defer > > > > registration of > the cpu device (register_cpu() API) with the > > > > Linux device model. Later > is what we are doing to hide/unhide the > > > > CPUs from the user while STA.Enabled Bit is toggled due to CPU (un)plug action. > > > > > > > > There are two ways to approach this IMV, and both seem to be valid > > > > in principle. > > > > > > > > One would be to treat CPUs with the "enabled" bit clear as not > > > > present and create all of the requisite data structures for them > > > > when they become available (in analogy with the "real hot-add" case). > > > > > > > > > Right. This one is out-of-scope for ARM Arch as we cannot defer any > > > Arch specific sizing and initializations after boot i.e. when > > > processor_add() gets called again later as a trigger of CPU plug action at the Virtualizer. > > > > > > > > > > > > > > The alternative one is to create all of the requisite data > > > > structures for the CPUs that you find during boot, but register CPU > > > > devices for those having the "enabled" _STA bit set. > > > > > > > > > Correct. and we defer the registration for CPUs with online-capable > > > Bit set in the ACPI MADT/GICC data structure. These CPUs basically > > > form set of hot-pluggable CPUs on ARM. > > > > > > > > > > > > > > It looks like you have chosen the second approach, which is fine > > > > with me (although personally, I would rather choose the first one), > > > > but then your arch code needs to arrange for the requisite CPU data > > > > structures etc. to be set up before acpi_processor_add() gets > > > > called because, as per the above, the latter just rejects CPUs with the "enabled" _STA bit clear. > > > > > > Yes, correct. First one is not possible - at least for now and to have > > > that it will require lot of rework especially at GIC. But there are > > > many other arch components (like timers, PMUs, etc.) whose behavior > > > needs to be specified somewhere in the architecture as well. All these are closely coupled with the ARM CPU architecture. > > > (it's beyond this discussion and lets leave that to ARM folks). > > > > > > This patch-set has a change to deal with ACPI _STA.Enabled Bit accordingly. > > > > Well, I'm having a hard time with this. > > > > As far as CPU enumeration goes, if the "enabled" bit is clear in _STA, it does > > not happen at all. Both on ARM and on x86. > > sure, I can see that now. > > > > > Now tell me why there need to be two separate code paths calling > > arch_register_cpu() in acpi_processor_add()? > > > As mentioned above, in the first prototype I floated in the year 2020 any attempts > to override the __weak call of arch_register_cpu() for ARM64 was discouraged. > Though, the reasons might have changed now as some code has been moved. > > Once we are allowed to override the calls then there are many more possibilities > which open up to simplify the code further. Well, IMV this should just be an arch function with no __weak defaults, because the default would probably be unusable in practice anyway.