On 9/30/24 22:00, Xin Li (Intel) wrote: > The per CPU variable cea_exception_stacks contains per CPU stacks for > NMI, #DB and #DF, which is referenced in KVM to set host FRED RSP[123] > each time a vCPU is loaded onto a CPU, thus it needs to be exported. Nit: It's not obvious how 'cea_exception_stacks' get used in this series. It's never referenced explicitly. I did figure it out by looking for 'RSP[123]' references, but a much better changelog would be something like: The per CPU array 'cea_exception_stacks' points to per CPU stacks for NMI, #DB and #DF. It is normally referenced via the #define: __this_cpu_ist_top_va(). FRED introduced new fields in the host-state area of the VMCS for stack levels 1->3 (HOST_IA32_FRED_RSP[123]). KVM must populate these each time a vCPU is loaded onto a CPU. See how that explicitly gives the reader greppable strings for "__this_cpu_ist_top_va" and "HOST_IA32_FRED_RSP"? That makes it much easier to figure out what is going on. I was also momentarily confused about why these loads need to be done on _every_ vCPU load. I think it's because the host state can change as the vCPU moves around to different physical CPUs and __this_cpu_ist_top_va() can and will change. But it's a detail that I think deserves to be explained in the changelog. There is also this note in vmx_vcpu_load_vmcs(): > /* > * Linux uses per-cpu TSS and GDT, so set these when switching > * processors. See 22.2.4. > */ which makes me think that it might not be bad to pull *all* of the per-cpu VMCS field population code out into a helper since the reasoning of why these need to be repopulated is identical. Also, what's the purpose of clearing GUEST_IA32_FRED_RSP[123] at init_vmcs() time? I would have thought that those values wouldn't matter until the VMCS gets loaded at vmx_vcpu_load_vmcs() when they are overwritten anyway. Or, I could be just totally misunderstanding how KVM consumes the VMCS. :)