On Wed, Feb 26, 2020 at 12:41:09PM -0800, Jacob Keller wrote: > On 2/25/2020 4:58 PM, Jacob Keller wrote: > > On 2/25/2020 4:42 PM, Sean Christopherson wrote>> So there's something > > weird going on. Maybe "boot_cpu_has" in the > >>> vmx_disabled_by_bios is wrong? Hmm. > >> > >> Hmm, perhaps a bug somewhere else is overwriting the cpufeatures bit for > >> X86_FEATURE_VMX. Let me see if I can reproduce from net-next. > >> > > > > If you have any further suggestions for debugging, I'm happy to help try > > to figure this out. To my eyes, it looks like somehow bits get reset... > > It's definitely not clear to me how this happens. > > > > There is the get_cpu_caps call.. but that seems to correctly call > > apply_forced_caps at the end. > > > > That's all I have time for today. > > > > Thanks, > > Jake > > > > Hi, > > I kept digging into this, and I added a further print to the get_cpu_cap > function. > > It looks like get_cpu_cap is being called again *after* > init_ia32_feat_ctl... > > Digging further, I discovered this appears to be the call in setup_pku, > which would only be enabled for systems which have X86_FEATURE_PKU > enabled and supported. It's quite likely that test systems may not have > had this feature, hence why it went undetected till now. Ya, probably not a whole lot of folks with Icelake silicon and VMX disabled in BIOS. I'll see if I can reproduce on my ICX system, that would make testing a fix a little easier. > Because of the extra get_cpu_cap call, the capabilities are reset. Since > we never use setup_clear_cpu_cap or pass NULL to clear_cpu_cap, the code > that sets the global cpu_caps_cleared bits is not run. > > It's not clear to me what the best fix for this is. > > Perhaps init_ia32_feat_ctl should be something run during > early_identify_cpu, since it's really checking global status (rdmsr), > and not per-CPU status. And then it could directly operate to call > setup_clear_cpu_cap, which would properly clear the bit globally, > ensuring that apply_forced_caps kicks in? > > Or this needs to somehow be run *after* setup_pku? But that doesn't feel > very robust. Bummer. Using clear_cpu_cap() instead of setup_clear_cpu_cap() was me being fancy and trying to allow KVM to identify the case where VMX is available and configured on some CPUs but not all. I'll work on a fix.