On Wed, Nov 30, 2022 at 08:48:14AM -0800, Dave Hansen wrote: > On 11/30/22 07:53, Roger Pau Monné wrote: > > On Tue, Nov 29, 2022 at 09:43:53AM -0800, Dave Hansen wrote: > >> On 11/21/22 02:21, Roger Pau Monne wrote: > >>> When running as a Xen dom0 the number of CPUs available to Linux can > >>> be different from the number of CPUs present on the system, but in > >>> order to properly fetch processor performance related data _PDC must > >>> be executed on all the physical CPUs online on the system. > >> > >> How is the number of CPUs available to Linux different? > >> > >> Is this a result of the ACPI tables that dom0 sees being "wrong"? > > > > Depends on the mode. This is all specific to Linux running as a Xen > > dom0. > > > > For PV dom0 the ACPI tables that dom0 sees are the native ones, > > however available CPUs are not detected based on the MADT, but using > > hypercalls, see xen_smp_ops struct and the > > x86_init.mpparse.get_smp_config hook used in smp_pv.c > > (_get_smp_config()). > > > > For a PVH dom0 Xen provides dom0 with a crafted MADT table that does > > only contain the CPUs available to dom0, and hence is likely different > > from the native one present on the hardware. > > > > In any case, the dynamic tables dom0 sees where the Processor > > objects/devices reside are not modified by Xen in any way, so the ACPI > > Processors are always exposed to dom0 as present on the native > > tables. > > > > Xen cannot parse the dynamic ACPI tables (neither should it, since > > then it would act as OSPM), so it relies on dom0 to provide same data > > present on those tables for Xen to properly manage the frequency and > > idle states of the CPUs on the system. > > > >>> The current checks in processor_physically_present() result in some > >>> processor objects not getting their _PDC methods evaluated when Linux > >>> is running as Xen dom0. Fix this by introducing a custom function to > >>> use when running as Xen dom0 in order to check whether a processor > >>> object matches a CPU that's online. > >> > >> What is the end user visible effect of this problem and of the solution? > > > > Without this fix _PDC is only evaluated for the CPUs online from dom0 > > point of view, which means that if dom0 is limited to 8 CPUs but the > > system has 24 CPUs, _PDC will only get evaluated for 8 CPUs, and that > > can have the side effect of the data then returned by _PSD method or > > other methods being different between CPUs where _PDC was evaluated vs > > CPUs where the method wasn't evaluated. Such mismatches can > > ultimately lead to for example the CPU frequency driver in Xen not > > initializing properly because the coordination methods between CPUs on > > the same domain don't match. > > > > Also not evaluating _PDC prevents the OS (or Xen in this case) > > from notifying ACPI of the features it supports. > > > > IOW this fix attempts to make sure all physically online CPUs get _PDC > > evaluated, and in order to to that we need to ask the hypervisor if a > > Processor ACPI ID matches an online CPU or not, because Linux doesn't > > have that information when running as dom0. > > > > Hope the above makes sense and allows to make some progress on the > > issue, sometimes it's hard to summarize without getting too > > specific, > > Yes, writing changelogs is hard. :) > > Let's try though. I was missing some key pieces of background here. > Believe it or not, I had no idea off the top of my head what _PDC was or > why it's important. > > the information about _PDC being required on all processors was missing, > as was the information about the dom0's incomplete concept of the > available physical processors. > > == Background == > > In ACPI systems, the OS can direct power management, as opposed to the > firmware. This OS-directed Power Management is called OSPM. Part of > telling the firmware that the OS going to direct power management is > making ACPI "_PDC" (Processor Driver Capabilities) calls. These _PDC > calls must be made on every processor. If these _PDC calls are not > completed on every processor it can lead to inconsistency and later > failures in things like the CPU frequency driver. I think the "on every processor" is not fully accurate. _PDC methods need to be evaluated for every Processor object. Whether that evaluation is executed on the physical processor that matches the ACPI UID of the object/device is not mandatory (iow: you can evaluate the _PDC methods of all Processor objects from the BSP). The usage of 'on' seems to me to note that the methods are executed on the matching physical processors. I would instead use: "... must be made for every processor. If these _PDC calls are not completed for every processor..." But I'm not a native English speaker, so this might all be irrelevant. > > In a Xen system, the dom0 kernel is responsible for system-wide power > management. The dom0 kernel is in charge of OSPM. However, the Xen > hypervisor hides some processors information from the dom0 kernel. This > is presumably done to ensure that the dom0 system has less interference > with guests that want to use the other processors. dom0 on a Xen system is just another guest, so the admin can limit the number of CPUs available to dom0, that's why we get into this weird situation. > == Problem == > > But, this leads to a problem: the dom0 kernel needs to run _PDC on all > the processors, but it can't always see them. > > == Solution == > > In dom0 kernels, ignore the existing ACPI method for determining if a > processor is physically present because it might not be accurate. > Instead, ask the hypervisor for this information. > > This ensures that ... > > ---- > > Is that about right? Yes, I think it's accurate. I will add to my commit log, thanks! On the implementation side, is the proposed approach acceptable? Mostly asking because it adds Xen conditionals to otherwise generic ACPI code. Thanks, Roger.