Hi Ard, > From: Ard Biesheuvel <ardb@xxxxxxxxxx> > Sent: Thursday, September 14, 2023 4:34 PM > To: Jonathan Cameron <jonathan.cameron@xxxxxxxxxx> > Cc: James Morse <james.morse@xxxxxxx>; linux-pm@xxxxxxxxxxxxxxx; > loongarch@xxxxxxxxxxxxxxx; linux-acpi@xxxxxxxxxxxxxxx; linux- > arch@xxxxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx; linux-arm- > kernel@xxxxxxxxxxxxxxxxxxx; linux-riscv@xxxxxxxxxxxxxxxxxxx; > kvmarm@xxxxxxxxxxxxxxx; x86@xxxxxxxxxx; Salil Mehta > <salil.mehta@xxxxxxxxxx>; Russell King <linux@xxxxxxxxxxxxxxx>; Jean- > Philippe Brucker <jean-philippe@xxxxxxxxxx>; jianyong.wu@xxxxxxx; > justin.he@xxxxxxx > Subject: Re: [RFC PATCH v2 27/35] ACPICA: Add new MADT GICC flags fields > [code first?] > > On Thu, 14 Sept 2023 at 16:55, Jonathan Cameron > <Jonathan.Cameron@xxxxxxxxxx> wrote: > > > > On Thu, 14 Sep 2023 09:57:44 +0200 > > Ard Biesheuvel <ardb@xxxxxxxxxx> wrote: > > > > > Hello James, > > > > > > On Wed, 13 Sept 2023 at 18:41, James Morse <james.morse@xxxxxxx> wrote: > > > > > > > > Add the new flag field to the MADT's GICC structure. > > > > > > > > 'Online Capable' indicates a disabled CPU can be enabled later. > > > > > > > > > > Why do we need a bit for this? What would be the point of describing > > > disabled CPUs that cannot be enabled (and are you are aware of > > > firmware doing this?). > > > > Enabled being not set is common at some similar ACPI tables at least. > > > > This is available in most ACPI tables to allow firmware to use 'nearly' > > static tables and just tweak the 'enabled' bit to say if the record should > > be ignored or not. Also _STA not present which is for same trick. > > If you are doing clever dynamic tables, then you can just not present > > the entry. > > > > With that existing use case in mind, need another bit to say this > > one might one day turn up. Note this is copied from x86 though no > > one seems to have implemented the kernel support for them yet. > > > > Note as per my other reply - this isn't a code first proposal. It's in the > > spec already (via a code first proposal last year I think). > > > > > > > > So why are we not able to assume that this new bit can always be treated as '1'? > > > > Given above, need the extra bit to size stuff to allow for the CPU showing up > > late. > > > > So does this mean that on x86, the CPU object is instantiated only > when the hardware level hotplug occurs? And before that, the object > does not exist at all? That is correct but I am not sure if the presence of hardware Hotplug on x86 is even true. It all hidden behind firmware magic (I think). So x86 is able to use same infrastructure both for virtual and physical CPU Hotplug. >From the ACPI 6.3 > x86 have started to use online-capable bit for local x2apic in the MADT Table https://lore.kernel.org/lkml/168016878002.404.5262105401164408214.tip-bot2@tip-bot2/ https://lore.kernel.org/lkml/168016878085.404.6003734700616193238.tip-bot2@tip-bot2/ But there is a subtle difference in the way it is being used on x86 and on the ARM platform right now. On x86, during init, if the MADT entry for LAPIC is found to be online-capable and is enabled as well then possible and present cpumask gets set and a logical cpu-id is also allocated. If the MADT entry is online-capable but not enabled then disabled cpus are still counted but logical cpu-id is not allocated during init time and in fact setting present mask bits are also deferred till Hotplug happens later. static int acpi_register_lapic(int id, u32 acpiid, u8 enabled) { [...] if (!enabled) { /* Not ACPI_MADT_ENABLED */ ++disabled_cpus; return -EINVAL; } [...] cpu = generic_processor_info(id, ver); /* logical cupid, present mask*/ [...] return cpu; } acpi_parse_x2apic(union acpi_subtable_headers * header, const unsigned long end) { struct acpi_madt_local_x2apic *processor = NULL; processor = (struct acpi_madt_local_x2apic *)header; [...] enabled = processor->lapic_flags & ACPI_MADT_ENABLED; [...] /* don't register processors that cannot be onlined */ if (!acpi_is_processor_usable(processor->lapic_flags)) return 0; [...] acpi_register_lapic(apic_id, processor->uid, enabled); return 0; } On ARM, we similarly identify all MADT GICC entries which are *usable* i.e. either are *ENABLED* or *online-capable*. But Unlike x86, all cpus corresponding to usable MADT GICC entries gets logical cpu-ds allocated and their present bit mask set during boot itself. Hence, present mask is always equal to the possible cpus mask on ARM. https://uefi.org/specs/ACPI/6.5/05_ACPI_Software_Programming_Model.html#gicc-cpu-interface-flags For online-capable but *not* enabled CPUs we defer the registration of the logical CPU-ids with the Linux Driver Model till the time ACPI Hotplug event occurs. This means register_cpu() is not called for the disabled CPUs during init time. Hence, sysfs entries for the disabled CPUs don’t exits. But above creates bit of confusion to a x86 accustomed users as on ARM with our solution, present CPUs are always equal to possible CPUs. $ cat /sys/devices/system/cpu/possible 0-5 $ cat /sys/devices/system/cpu/present 0-5 $ cat /sys/devices/system/cpu/online 0-1 $ cat /sys/devices/system/cpu/offline 2-5 There is no way to know which CPUs have been hotplugged using above interface. Hence, we have also a new mask of enabled CPUs in the $ cat /sys/devices/system/cpu/possible 0-5 $ cat /sys/devices/system/cpu/present 0-5 $ cat /sys/devices/system/cpu/enabled 0-2 $ cat /sys/devices/system/cpu/online 0-1 $ cat /sys/devices/system/cpu/offline 2-5 Qemu parameters: -smp cpu=3 maxcpus=6 Kernel parameter: maxcpus=2 > > Because it seems to me that _STA, having both enabled and present > bits, could already describe what we need here, and arguably, a CPU > that is not both present and enabled should not be used by the OS. > This would leave room for representing off-line CPUs as present but > not enabled. That is correct understanding. For plugged cpus: _STA.Present=1 and _STA.Enabled=1 For unplugged cpus: _STA.Present=1 and _STA.Enabled=0 Hot(un)plugging is only allowed if during boot the GICC entries were discovered as *online-capable*. GICC entries which are MADT GICC enabled during boot cannot be hot-unplugged either. Catch: If hot unplugging is to be supported for all cpus except the boot then we MUST set all CPUs except boot CPUs as *online-capable*. This poses compatibility problems with the legacy OS running over latest machines/platforms supporting Hotplug feature. OS might ignore all the online-capable bits during boot time and hence only 1 CPU i.e. boot cpus might appear. Hence, MADT.GICC.Enabled bits and MADT.GICC.online-capable need Not be mutually exclusive. This requires more discussions! You might find below useful: https://kvm-forum.qemu.org/2023/talk/9SMPDQ/ > > Apologies if I am missing something obvious here - the whole rationale > behind this thing is rather confusing to me.