AFAIK no Intel CPU has ever had that behavior, and always cleared the segments; I don't Intel has any plans of supporting such a CPUID bit (although I'd certainly be willing to take such a request back to the CPU teams on request.) That being said, this sounds like an ideal use for the hypervisor CPU feature flag. Maybe we should consider a migration hypervisor flag too to explicitly tell the kernel not to rely on hardware probing that breaks migration in general. Now, with a CPUID but being introduced, the right thing would be to use the CPUID bit as a feature instead of using a bug flag, and add whitelisting in the vendor-specific code as applicable. On October 18, 2021 11:17:30 AM PDT, Borislav Petkov <bp@xxxxxxxxx> wrote: >On Wed, Oct 13, 2021 at 03:22:30PM +0100, Jane Malalane wrote: >> @@ -650,6 +651,27 @@ static void early_init_amd(struct cpuinfo_x86 *c) >> if (c->x86_power & BIT(14)) >> set_cpu_cap(c, X86_FEATURE_RAPL); >> >> + /* >> + * Zen1 and earlier CPUs don't clear segment base/limits when >> + * loading a NULL selector. This has been designated >> + * X86_BUG_NULL_SEG. >> + * >> + * Zen3 CPUs advertise Null Selector Clears Base in CPUID. >> + * Zen2 CPUs also have this behaviour, but no CPUID bit. >> + * >> + * A hypervisor may sythesize the bit, but may also hide it >> + * for migration safety, so we must not probe for model >> + * specific behaviour when virtualised. >> + */ >> + if (c->extended_cpuid_level >= 0x80000021 && cpuid_eax(0x80000021) & BIT(6)) >> + nscb = true; >> + >> + if (!cpu_has(c, X86_FEATURE_HYPERVISOR) && !nscb && c->x86 == 0x17) >> + nscb = check_null_seg_clears_base(c); >> + >> + if (!nscb) >> + set_cpu_bug(c, X86_BUG_NULL_SEG); >> + >> #ifdef CONFIG_X86_64 >> set_cpu_cap(c, X86_FEATURE_SYSCALL32); >> #else > >Can we do something like this? > >It is carved out into a separate function which you can simply call from >early_init_amd() and early_init_hygon(). > >I guess you can put that function in arch/x86/kernel/cpu/common.c or so. > >Then, you should put the comments right over the code like I've done >below so that one can follow what's going on with each particular check. > >I've also flipped the logic a bit and it is simpler this way. > >Totally untested of course. > >static void early_probe_null_seg_clearing_base(struct cpuinfo_x86 *c) >{ > /* > * A hypervisor may sythesize the bit, but may also hide it > * for migration safety, so do not probe for model-specific > * behaviour when virtualised. > */ > if (cpu_has(c, X86_FEATURE_HYPERVISOR)) > return; > > /* Zen3 CPUs advertise Null Selector Clears Base in CPUID. */ > if (c->extended_cpuid_level >= 0x80000021 && cpuid_eax(0x80000021) & BIT(6)) > return; > > /* Zen2 CPUs also have this behaviour, but no CPUID bit. */ > if (c->x86 == 0x17 && check_null_seg_clears_base(c)) > return; > > /* All the remaining ones are affected */ > set_cpu_bug(c, X86_BUG_NULL_SEG); >} > >> diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c >> index 0f8885949e8c..2ca4afb97247 100644 >> --- a/arch/x86/kernel/cpu/common.c >> +++ b/arch/x86/kernel/cpu/common.c >> @@ -1395,7 +1395,7 @@ void __init early_cpu_init(void) >> early_identify_cpu(&boot_cpu_data); >> } >> >> -static void detect_null_seg_behavior(struct cpuinfo_x86 *c) >> +bool check_null_seg_clears_base(struct cpuinfo_x86 *c) >> { >> #ifdef CONFIG_X86_64 >> /* >> @@ -1418,10 +1418,10 @@ static void detect_null_seg_behavior(struct cpuinfo_x86 *c) >> wrmsrl(MSR_FS_BASE, 1); >> loadsegment(fs, 0); >> rdmsrl(MSR_FS_BASE, tmp); >> - if (tmp != 0) >> - set_cpu_bug(c, X86_BUG_NULL_SEG); >> wrmsrl(MSR_FS_BASE, old_base); >> + return tmp == 0; >> #endif >> + return true; >> } >> >> static void generic_identify(struct cpuinfo_x86 *c) >> @@ -1457,8 +1457,6 @@ static void generic_identify(struct cpuinfo_x86 *c) >> >> get_model_name(c); /* Default name */ >> >> - detect_null_seg_behavior(c); >> - >> /* >> * ESPFIX is a strange bug. All real CPUs have it. Paravirt >> * systems that run Linux at CPL > 0 may or may not have the > >So this function is called on all x86 CPUs. Are you sure others besides >AMD and Hygon do not have the same issue? > >IOW, I wouldn't remove that call here. > >But then this is the identify() phase in the boot process and you've >moved it to early_identify() by putting it in the ->c_early_init() >function pointer on AMD and Hygon. > >Is there any particular reasoning for this or can that detection remain >in ->c_identify()? > >Because if this null seg behavior detection should happen on all >CPUs - and I think it should, because, well, it has been that way >until now - then the vendor specific identification minus what >detect_null_seg_behavior() does should run first and then after >->c_identify() is done, you should do something like: > > if (!cpu_has_bug(c, X86_BUG_NULL_SEG)) { > if (!check_null_seg_clears_base(c)) > set_cpu_bug(c, X86_BUG_NULL_SEG); > } > >so that it still takes place on all CPUs. > >I.e., you should split the detection. > >I hope I'm making sense ... > >Ah, btw, that @c parameter to detect_null_seg_behavior() is unused - you >should remove it in a pre-patch. > >Thx. > -- Sent from my Android device with K-9 Mail. Please excuse my brevity.