On Tue, Aug 13, 2024 at 04:00:09PM +0800, Yuan Yao <yuan.yao@xxxxxxxxxxxxxxx> wrote: > > +/* VMM can pass one 64bit auxiliary data to vcpu via RCX for guest BIOS. */ > > +static int tdx_td_vcpu_init(struct kvm_vcpu *vcpu, u64 vcpu_rcx) > > +{ > > + const struct tdx_sysinfo_module_info *modinfo = &tdx_sysinfo->module_info; > > + struct vcpu_tdx *tdx = to_tdx(vcpu); > > + unsigned long va; > > + int ret, i; > > + u64 err; > > + > > + if (is_td_vcpu_created(tdx)) > > + return -EINVAL; > > + > > + /* > > + * vcpu_free method frees allocated pages. Avoid partial setup so > > + * that the method can't handle it. > > + */ > > This looks not that clear, why vcpu_free can't handle it is not explained. > > Looking the whole function, page already added into TD by > SEAMCALL should be cleared before free back to kernel, > tdx_vcpu_free() can handle them. Other pages can be freed > directly and can't be handled by tdx_vcpu_free() because > they're not added into TD. Is this right understanding ? Yes. If we result in error in the middle of TDX vCPU initialization, TDH.MEM.PAGE.RECLAIM() result in error due to TDX module state check. TDX module seems to assume that we don't fail in the middle of TDX vCPU initialization. Maybe we can add WARN_ON_ONCE() for such cases. > > + ret = -EIO; > > + pr_tdx_error(TDH_VP_CREATE, err); > > + goto free_tdvpx; > > + } > > + > > + for (i = 0; i < tdx_sysinfo_nr_tdcx_pages(); i++) { > > + va = __get_free_page(GFP_KERNEL_ACCOUNT); > > + if (!va) { > > + ret = -ENOMEM; > > + goto free_tdvpx; > > It's possible that some pages already added into TD by > tdh_vp_addcx() below and they won't be handled by > tdx_vcpu_free() if goto free_tdvpx here; Due to TDX TD state check, we can't free partially assigned TDCS pages. TDX module seems to assume that TDH.VP.ADDCX() won't fail in the middle. > > + else > > + err = tdh_vp_init(tdx, vcpu_rcx); > > + > > + if (KVM_BUG_ON(err, vcpu->kvm)) { > > + pr_tdx_error(TDH_VP_INIT, err); > > + return -EIO; > > + } > > + > > + vcpu->arch.mp_state = KVM_MP_STATE_RUNNABLE; > > + tdx->td_vcpu_created = true; > > + > > + return 0; > > + > > +free_tdvpx: > > How about s/free_tdvpx/free_tdcx > > In 1.5 TDX spec these pages are all called TDCX pages, and > the function context already indicates that we're talking about > vcpu's TDCX pages. Oops, this is left over when tdvpx was converted to tdcs. > > +static int tdx_vcpu_init(struct kvm_vcpu *vcpu, struct kvm_tdx_cmd *cmd) > > +{ > > + struct msr_data apic_base_msr; > > + struct vcpu_tdx *tdx = to_tdx(vcpu); > > + int ret; > > + > > + if (cmd->flags) > > + return -EINVAL; > > + if (tdx->initialized) > > + return -EINVAL; > > + > > + /* > > + * As TDX requires X2APIC, set local apic mode to X2APIC. User space > > + * VMM, e.g. qemu, is required to set CPUID[0x1].ecx.X2APIC=1 by > > + * KVM_SET_CPUID2. Otherwise kvm_set_apic_base() will fail. > > + */ > > + apic_base_msr = (struct msr_data) { > > + .host_initiated = true, > > + .data = APIC_DEFAULT_PHYS_BASE | LAPIC_MODE_X2APIC | > > + (kvm_vcpu_is_reset_bsp(vcpu) ? MSR_IA32_APICBASE_BSP : 0), > > + }; > > + if (kvm_set_apic_base(vcpu, &apic_base_msr)) > > + return -EINVAL; > > + > > + ret = tdx_td_vcpu_init(vcpu, (u64)cmd->data); Because we set guest rcx only, we use cmd->data. Can we add reserved area for future use similar to struct kvm_tdx_init_vm? i.e. introduce something like struct kvm_tdx_init_vcpu {u64 rcx; u64 reserved[]; } -- Isaku Yamahata <isaku.yamahata@xxxxxxxxx>