On 6/03/25 20:19, Paolo Bonzini wrote:
> On 2/27/25 19:37, Adrian Hunter wrote:
>> On 25/02/25 08:15, Xiaoyao Li wrote:
>>> On 2/24/2025 8:27 PM, Adrian Hunter wrote:
>>>> On 20/02/25 15:16, Xiaoyao Li wrote:
>>>>> On 1/29/2025 5:58 PM, Adrian Hunter wrote:
>>>>>> +#define TDX_REGS_UNSUPPORTED_SET (BIT(VCPU_EXREG_RFLAGS) | \
>>>>>> +                                  BIT(VCPU_EXREG_SEGMENTS))
>>>>>> +
>>>>>> +fastpath_t tdx_vcpu_run(struct kvm_vcpu *vcpu, bool force_immediate_exit)
>>>>>> +{
>>>>>> +    /*
>>>>>> +     * force_immediate_exit requires vCPU entering for events injection with
>>>>>> +     * an immediately exit followed. But The TDX module doesn't guarantee
>>>>>> +     * entry, it's already possible for KVM to _think_ it completely entry
>>>>>> +     * to the guest without actually having done so.
>>>>>> +     * Since KVM never needs to force an immediate exit for TDX, and can't
>>>>>> +     * do direct injection, just warn on force_immediate_exit.
>>>>>> +     */
>>>>>> +    WARN_ON_ONCE(force_immediate_exit);
>>>>>> +
>>>>>> +    trace_kvm_entry(vcpu, force_immediate_exit);
>>>>>> +
>>>>>> +    tdx_vcpu_enter_exit(vcpu);
>>>>>> +
>>>>>> +    vcpu->arch.regs_avail &= ~TDX_REGS_UNSUPPORTED_SET;
>>>>>
>>>>> I don't understand this. Why only clear RFLAGS and SEGMENTS?
>>>>>
>>>>> When creating the vcpu, vcpu->arch.regs_avail = ~0 in kvm_arch_vcpu_create().
>>>>>
>>>>> now it only clears RFLAGS and SEGMENTS for TDX vcpu, which leaves other bits set. But I don't see any code that syncs the guest value of into vcpu->arch.regs[reg].
>>>>
>>>> TDX guest registers are generally not known but
>>>> values are placed into vcpu->arch.regs when needed
>>>> to work with common code.
>>>>
>>>> We used to use ~VMX_REGS_LAZY_LOAD_SET and tdx_cache_reg()
>>>> which has since been removed.
>>>>
>>>> tdx_cache_reg() did not support RFLAGS, SEGMENTS,
>>>> EXIT_INFO_1/EXIT_INFO_2 but EXIT_INFO_1/EXIT_INFO_2 became
>>>> needed, so that just left RFLAGS, SEGMENTS.
>>>
>>> Quote what Sean said [1]
>>>
>>>   “I'm also not convinced letting KVM read garbage for RIP, RSP, CR3, or
>>>   PDPTRs is at all reasonable. CR3 and PDPTRs should be unreachable,
>>>   and I gotta imagine the same holds true for RSP. Allow reads/writes
>>>   to RIP is fine, in that it probably simplifies the overall code.”
>>>
>>> We need to justify why to let KVM read "garbage" of VCPU_REGS_RIP,
>>> VCPU_EXREG_PDPTR, VCPU_EXREG_CR0, VCPU_EXREG_CR3, VCPU_EXREG_CR4,
>>> VCPU_EXREG_EXIT_INFO_1, and VCPU_EXREG_EXIT_INFO_2 are needed.
>>>
>>> The changelog justify nothing for it.
>>
>> Could add VCPU_REGS_RIP, VCPU_REGS_RSP, VCPU_EXREG_CR3, VCPU_EXREG_PDPTR.
>> But not VCPU_EXREG_CR0 nor VCPU_EXREG_CR4 since we started using them.
>
> Hi Adrian,
>
> how is CR0 used? And CR4 is only used other than for loading the XSAVE state, I think?

I meant it is used in the sense that patch
"[PATCH V2 07/12] KVM: TDX: restore host xsave state when exit from the guest TD"
provides a value for it. But it looks like it might be accessible via:

    store_regs()
        __get_sregs()
            __get_sregs_common()

Sean wanted a maximal CR0 value consistent with the CR4.

CR4 is also being used in kvm_update_cpuid_runtime().

>
> I will change this to a list of specific available registers instead of using "&= ~", and it would be even better if CR0/CR4 are not on the list.
>
> Paolo
>
>>> btw, how EXIT_INFO_1/EXIT_INFO_2 became needed? It seems I cannot find any TDX code use them.
>>
>> vmx_get_exit_qual() / vmx_get_intr_info() are now used by TDX.
>>
>>>
>>> [1] https://lore.kernel.org/all/Z2GiQS_RmYeHU09L@xxxxxxxxxx/
>>>
>>>>>
>>>>>> +    trace_kvm_exit(vcpu, KVM_ISA_VMX);
>>>>>> +
>>>>>> +    return EXIT_FASTPATH_NONE;
>>>>>> +}
>>>>>
>>>>
>>>
>>
>>
>
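
For reference, the difference between the "&= ~" form in the quoted patch and the "list of specific available registers" form Paolo describes can be sketched in standalone C roughly as below. This is only an illustration, not the actual KVM change: the demo_* names and enum values are hypothetical stand-ins for the VCPU_REGS_*/VCPU_EXREG_* indices, and which registers end up on the allow list (EXIT_INFO_1/EXIT_INFO_2 here) is an assumption.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical stand-ins for KVM's register indices. The numeric values
 * are illustrative only and do not match the kernel's enum kvm_reg.
 */
enum demo_reg {
	DEMO_REG_RSP,
	DEMO_REG_RIP,
	DEMO_EXREG_PDPTR,
	DEMO_EXREG_CR0,
	DEMO_EXREG_CR3,
	DEMO_EXREG_CR4,
	DEMO_EXREG_RFLAGS,
	DEMO_EXREG_SEGMENTS,
	DEMO_EXREG_EXIT_INFO_1,
	DEMO_EXREG_EXIT_INFO_2,
	DEMO_NR_REGS
};

/* Single-bit mask for one register index. */
static inline uint32_t demo_bit(enum demo_reg reg)
{
	assert(reg < DEMO_NR_REGS);
	return UINT32_C(1) << reg;
}

/*
 * Deny-list form, as in the quoted patch: start from whatever is set
 * (~0 at vCPU creation) and clear only the known-unsupported registers.
 * Every other register stays marked available.
 */
static inline uint32_t demo_avail_denylist(uint32_t regs_avail)
{
	return regs_avail & ~(demo_bit(DEMO_EXREG_RFLAGS) |
			      demo_bit(DEMO_EXREG_SEGMENTS));
}

/*
 * Allow-list form: mark only explicitly listed registers as available.
 * The membership of this list is an assumption made for illustration.
 */
static inline uint32_t demo_avail_allowlist(void)
{
	return demo_bit(DEMO_EXREG_EXIT_INFO_1) |
	       demo_bit(DEMO_EXREG_EXIT_INFO_2);
}

/* Nonzero if the register's bit is set in the availability mask. */
static inline int demo_is_avail(uint32_t regs_avail, enum demo_reg reg)
{
	return (regs_avail & demo_bit(reg)) != 0;
}
```

With the deny-list form, a register such as CR3 remains marked available unless someone remembers to add it to the unsupported set; with the allow-list form it is unavailable by default, which is the property being asked for above.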