On Tue, 2023-02-07 at 15:24 +0100, Thomas Gleixner wrote: > On Tue, Feb 07 2023 at 11:27, David Woodhouse wrote: > > On Tue, 2023-02-07 at 10:57 +0000, David Woodhouse wrote: > > > • This CPU was present but no other CPU in this cluster was actually > > > brought up at boot time so the cluster_mask wasn't allocated. > > > > > > The code looks right, I don't grok the comment about partial clusters > > > and virtualization, and would have worded it something along the above > > > lines? > > > > As I get my head around that, I think the code needs to change too. > > What if we *unplug* the only CPU in a cluster (present→possible), then > > add a new one in the same cluster? The new one would get a new > > cluster_mask. Which is kind of OK for now but then if we re-add the > > original CPU it'd continue to use its old cluster_mask. > > Indeed. > > > Now, that's kind of weird if it's physical CPUs because that cluster is > > within a given chip, isn't it? But with virtualization maybe that's > > something that could happen, and it doesn't hurt to be completely safe > > by using for_each_possible_cpu() instead? > > Yes. Virtualization does aweful things.... > > > Now looks like this: > > /* > > * On post boot hotplug for a CPU which was not present at boot time, > > * iterate over all possible CPUs (even those which are not present > > * any more) to find any existing cluster mask. > > */ > > for_each_possible_cpu(cpu_i) { > > Looks good! Thanks. I've reworked and I think I've caught everything. Didn't want to elide the credit where Usama had done some of the forward-porting work, so I've left those notes and the SoB intact on those patches, on the assumption that they will be reposting the series after proper testing on hardware again anyway (I'm only spawning it in qemu right now). https://git.infradead.org/users/dwmw2/linux.git/shortlog/refs/heads/parallel-6.2-rc7 The only real code change other than what we've discussed here is to implement what we talked about for CPUID 0xb vs. 0x1 etc: /* * We can do 64-bit AP bringup in parallel if the CPU reports * its APIC ID in CPUID (either leaf 0x0B if we need the full * APIC ID in X2APIC mode, or leaf 0x01 if 8 bits are * sufficient). Otherwise it's too hard. And not for SEV-ES * guests because they can't use CPUID that early. */ if (IS_ENABLED(CONFIG_X86_32) || boot_cpu_data.cpuid_level < 1 || (x2apic_mode && boot_cpu_data.cpuid_level < 0xb) || cc_platform_has(CC_ATTR_GUEST_STATE_ENCRYPT)) do_parallel_bringup = false; if (do_parallel_bringup && x2apic_mode) { unsigned int eax, ebx, ecx, edx; /* * To support parallel bringup in x2apic mode, the AP will need * to obtain its APIC ID from CPUID 0x0B, since CPUID 0x01 has * only 8 bits. Check that it is present and seems correct. */ cpuid_count(0xb, 0, &eax, &ebx, &ecx, &edx); /* * AMD says that if executed with an umimplemented level in * ECX, then it will return all zeroes in EAX. Intel says it * will return zeroes in both EAX and EBX. Checking only EAX * should be sufficient. */ if (eax) { smpboot_control = STARTUP_SECONDARY | STARTUP_APICID_CPUID_0B; } else { pr_info("Disabling parallel bringup because CPUID 0xb looks untrustworthy\n"); do_parallel_bringup = false; } } else if (do_parallel_bringup) { /* Without X2APIC, what's in CPUID 0x01 should suffice. */ smpboot_control = STARTUP_SECONDARY | STARTUP_APICID_CPUID_01; }
Attachment:
smime.p7s
Description: S/MIME cryptographic signature