RE: [PATCH v5 03/18] ACPI: processor: Register deferred CPUs from acpi_processor_get_info()

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hello,

Engaging after a long time (I've been almost off grid due to very challenging personal circumstances).
Please find my response inline.

Thanks
Salil

>  From: Rafael J. Wysocki <rafael@xxxxxxxxxx>
>  Sent: Friday, April 12, 2024 7:31 PM
>  To: Jonathan Cameron <jonathan.cameron@xxxxxxxxxx>
>  Cc: linux-pm@xxxxxxxxxxxxxxx; loongarch@xxxxxxxxxxxxxxx; linux-
>  acpi@xxxxxxxxxxxxxxx; linux-arch@xxxxxxxxxxxxxxx; linux-
>  kernel@xxxxxxxxxxxxxxx; linux-arm-kernel@xxxxxxxxxxxxxxxxxxx;
>  kvmarm@xxxxxxxxxxxxxxx; x86@xxxxxxxxxx; Russell King
>  <linux@xxxxxxxxxxxxxxx>; Rafael J . Wysocki <rafael@xxxxxxxxxx>; Miguel
>  Luis <miguel.luis@xxxxxxxxxx>; James Morse <james.morse@xxxxxxx>;
>  Salil Mehta <salil.mehta@xxxxxxxxxx>; Jean-Philippe Brucker <jean-
>  philippe@xxxxxxxxxx>; Catalin Marinas <catalin.marinas@xxxxxxx>; Will
>  Deacon <will@xxxxxxxxxx>; Linuxarm <linuxarm@xxxxxxxxxx>;
>  justin.he@xxxxxxx; jianyong.wu@xxxxxxx
>  Subject: Re: [PATCH v5 03/18] ACPI: processor: Register deferred CPUs from
>  acpi_processor_get_info()
>  
>  On Fri, Apr 12, 2024 at 4:38 PM Jonathan Cameron
>  <Jonathan.Cameron@xxxxxxxxxx> wrote:
>  >
>  > From: James Morse <james.morse@xxxxxxx>
>  >
>  > The arm64 specific arch_register_cpu() call may defer CPU registration
>  > until the ACPI interpreter is available and the _STA method can be
>  > evaluated.
>  >
>  > If this occurs, then a second attempt is made in
>  > acpi_processor_get_info(). Note that the arm64 specific call has not
>  > yet been added so for now this will never be successfully called.
>  >
>  > Systems can still be booted with 'acpi=off', or not include an ACPI
>  > description at all as in these cases arch_register_cpu() will not have
>  > deferred registration when first called.
>  >
>  > This moves the CPU register logic back to a subsys_initcall(), while
>  > the memory nodes will have been registered earlier.
>  > Note this is where the call was prior to the cleanup series so there
>  > should be no side effects of moving it back again for this specific
>  > case.
>  >
>  > [PATCH 00/21] Initial cleanups for vCPU HP.
>  >
>  https://lore.kernel.org/all/ZVyz%2FVe5pPu8AWoA@xxxxxxxxxxxxxxxxxxxxx/
>  >
>  > e.g. 5b95f94c3b9f ("x86/topology: Switch over to
>  GENERIC_CPU_DEVICES")
>  >
>  > Signed-off-by: James Morse <james.morse@xxxxxxx>
>  > Reviewed-by: Gavin Shan <gshan@xxxxxxxxxx>
>  > Tested-by: Miguel Luis <miguel.luis@xxxxxxxxxx>
>  > Tested-by: Vishnu Pajjuri <vishnu@xxxxxxxxxxxxxxxxxxxxxx>
>  > Tested-by: Jianyong Wu <jianyong.wu@xxxxxxx>
>  > Signed-off-by: Russell King (Oracle) <rmk+kernel@xxxxxxxxxxxxxxx>
>  > Co-developed-by: Jonathan Cameron <Jonathan.Cameron@xxxxxxxxxx>
>  > Signed-off-by: Joanthan Cameron <Jonathan.Cameron@xxxxxxxxxx>
>  > ---
>  > v5: Update commit message to make it clear this is moving the
>  >     init back to where it was until very recently.
>  >
>  >     No longer change the condition in the earlier registration point
>  >     as that will be handled by the arm64 registration routine
>  >     deferring until called again here.
>  > ---
>  >  drivers/acpi/acpi_processor.c | 12 ++++++++++++
>  >  1 file changed, 12 insertions(+)
>  >
>  > diff --git a/drivers/acpi/acpi_processor.c
>  > b/drivers/acpi/acpi_processor.c index 93e029403d05..c78398cdd060
>  > 100644
>  > --- a/drivers/acpi/acpi_processor.c
>  > +++ b/drivers/acpi/acpi_processor.c
>  > @@ -317,6 +317,18 @@ static int acpi_processor_get_info(struct
>  > acpi_device *device)
>  >
>  >         c = &per_cpu(cpu_devices, pr->id);
>  >         ACPI_COMPANION_SET(&c->dev, device);
>  > +       /*
>  > +        * Register CPUs that are present. get_cpu_device() is used to skip
>  > +        * duplicate CPU descriptions from firmware.
>  > +        */
>  > +       if (!invalid_logical_cpuid(pr->id) && cpu_present(pr->id) &&
>  > +           !get_cpu_device(pr->id)) {
>  > +               int ret = arch_register_cpu(pr->id);
>  > +
>  > +               if (ret)
>  > +                       return ret;
>  > +       }
>  > +
>  >         /*
>  >          *  Extra Processor objects may be enumerated on MP systems with
>  >          *  less than the max # of CPUs. They should be ignored _iff
>  > --
>  
>  I am still unsure why there need to be two paths calling
>  arch_register_cpu() in acpi_processor_get_info().


This is because all CPUs are expected to be 'present' during the boot time for ARM64 arch.
This is not true for x86 world i.e. the logical_cpuid could  be invalid (and present mask not
set) for the x86 arch during the boot time.  Faking the 'present' behavior at the virtualizer
level for ARM is like interfering with the architecture and then tweaking the kernel to fit
that unauthorized hack. This has a potential to break the existing and future version of the
ARM arch. (Between, I'm one of the initial offender of doing that but later corrected the
approach after many discussions and KVM Forum presentations along with ARM )

Therefore, in ARM we keep all the processor as present and just use _STA enabled bit to
decide the online'ing of the processor and this requires a separate handling.


>  
>  Just below the comment partially pulled into the patch context above, there
>  is this code:
>  
>  if (invalid_logical_cpuid(pr->id) || !cpu_present(pr->id)) {
>           int ret = acpi_processor_hotadd_init(pr);
>  
>          if (ret)
>                  return ret;
>  }
>  
>  For the sake of the argument, fold acpi_processor_hotadd_init() into it and
>  drop the redundant _STA check from it:
>  
>  if (invalid_logical_cpuid(pr->id) || !cpu_present(pr->id)) {
>          if (invalid_phys_cpuid(pr->phys_id))
>                  return -ENODEV;
>  
>          cpu_maps_update_begin();
>          cpus_write_lock();
>  
>         ret = acpi_map_cpu(pr->handle, pr->phys_id, pr->acpi_id, &pr->id);
>         if (ret) {
>                  cpus_write_unlock();
>                  cpu_maps_update_done();
>                  return ret;
>         }
>         ret = arch_register_cpu(pr->id);
>         if (ret) {
>                  acpi_unmap_cpu(pr->id);
>  
>                  cpus_write_unlock();
>                  cpu_maps_update_done();
>                  return ret;
>         }
>        pr_info("CPU%d has been hot-added\n", pr->id);
>        pr->flags.need_hotplug_init = 1;
>  
>        cpus_write_unlock();
>        cpu_maps_update_done();
>  }
>  
>  so I'm not sure why this cannot be combined with the new code.
>  
>  Say acpi_map_cpu) / acpi_unmap_cpu() are turned into arch calls.


We cannot because logical cpu-id can never be invalid and cpus can
never be in NOT present state on ARM arch.


>  What's the difference then?  


Above is the precise difference. Changing the behavior of 'presence' in
the ARM architecture after boot is not allowed. With the latest efforts, we
have added the concept of 'online-capable' bit which can help in defer
online'ing the CPUs but then this is not same as not being present at the
boot time. 


The locking, which should be fine if I'm not
>  mistaken and need_hotplug_init that needs to be set if this code runs after
>  the processor driver has loaded AFAICS.

AFAICS, Locking looks to be okay to me as well.

Best regards
Salil.





[Index of Archives]     [Linux Kernel]     [Kernel Newbies]     [x86 Platform Driver]     [Netdev]     [Linux Wireless]     [Netfilter]     [Bugtraq]     [Linux Filesystems]     [Yosemite Discussion]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Samba]     [Device Mapper]

  Powered by Linux