Re: [PATCH v19 037/130] KVM: TDX: Make KVM_CAP_MAX_VCPUS backend specific

Isaku Yamahata <isaku.yamahata@xxxxxxxxx> · Thu, 9 May 2024 16:55:22 -0700

On Fri, May 10, 2024 at 11:19:44AM +1200,
"Huang, Kai" <kai.huang@xxxxxxxxx> wrote:

> 
> 
> On 10/05/2024 10:52 am, Sean Christopherson wrote:
> > On Fri, May 10, 2024, Kai Huang wrote:
> > > On 10/05/2024 4:35 am, Sean Christopherson wrote:
> > > > KVM x86 limits KVM_MAX_VCPUS to 4096:
> > > > 
> > > >     config KVM_MAX_NR_VCPUS
> > > > 	int "Maximum number of vCPUs per KVM guest"
> > > > 	depends on KVM
> > > > 	range 1024 4096
> > > > 	default 4096 if MAXSMP
> > > > 	default 1024
> > > > 	help
> > > > 
> > > > whereas the limitation from TDX is apparently simply due to TD_PARAMS taking
> > > > a 16-bit unsigned value:
> > > > 
> > > >     #define TDX_MAX_VCPUS  (~(u16)0)
> > > > 
> > > > i.e. it will likely be _years_ before TDX's limitation matters, if it ever does.
> > > > And _if_ it becomes a problem, we don't necessarily need to have a different
> > > > _runtime_ limit for TDX, e.g. TDX support could be conditioned on KVM_MAX_NR_VCPUS
> > > > being <= 64k.
> > > 
> > > Actually, in later versions of the TDX module (starting from 1.5 AFAICT), the
> > > module has a metadata field to report the maximum vCPUs that the module can
> > > support for all TDX guests.
> > 
> > My quick glance at the 1.5 source shows that the limit is still effectively
> > 0xffff, so again, who cares?  Assert on 0xffff compile time, and on the reported
> > max at runtime and simply refuse to use a TDX module that has dropped the minimum
> > below 0xffff.
> 
> I need to double check why this metadata field was added.  My concern is that
> future module versions may just lower the value.

TD partitioning would reduce it considerably.

> But another way to handle it is to adjust the code when that truly happens?
> Might not be ideal for the stable kernel situation, though?
>
> > > And we only allow the kvm->max_vcpus to be updated if it's a TDX guest in
> > > the vt_vm_enable_cap().  The reason is we want to avoid unnecessary change
> > > for normal VMX guests.
> > 
> > That's a frankly ridiculous reason to bury code in TDX.  Nothing is _forcing_
> > userspace to set KVM_CAP_MAX_VCPUS, i.e. there won't be any change to VMX VMs
> > unless userspace _wants_ there to be a change.
> 
> Right.  Anyway allowing userspace to set KVM_CAP_MAX_VCPUS for non-TDX
> guests shouldn't have any issue.
> 
> The main reason to bury code in TDX is it needs to additionally check
> tdx_info->max_vcpus_per_td.  We can just do in common code if we avoid that
> TDX specific check.

So we can make it arch-independent.

When creating a VM, tdx_vm_init() can set
kvm->max_vcpus = tdx_info->max_vcpus_per_td.  The check can then be common, like
"if (new max_vcpus > kvm->max_vcpus) error".

Or we can add kvm->hard_max_vcpu or something, and arch-common code can have
"if (kvm->hard_max_vcpu && new max_vcpu > kvm->hard_max_vcpu) error".
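
To make the two options concrete, here is a rough, untested userspace sketch.
struct kvm_model and both function names are made up for illustration and are
not existing KVM code; only the two comparisons come from the text above.

```c
#include <errno.h>

/* Hypothetical stand-in for the relevant fields of struct kvm. */
struct kvm_model {
	unsigned int max_vcpus;		/* current per-VM limit */
	unsigned int hard_max_vcpu;	/* hypothetical backend cap, 0 = none */
};

/*
 * Option 1: the backend (e.g. tdx_vm_init()) lowers max_vcpus at VM
 * creation, and common code only accepts values up to the current limit.
 */
int set_max_vcpus_common(struct kvm_model *kvm, unsigned int new_max)
{
	if (new_max > kvm->max_vcpus)
		return -EINVAL;

	kvm->max_vcpus = new_max;
	return 0;
}

/*
 * Option 2: keep a separate hard cap that the backend may set.  A cap of 0
 * means the backend imposes no extra limit, so the check is skipped.
 */
int set_max_vcpus_hard_cap(struct kvm_model *kvm, unsigned int new_max)
{
	if (kvm->hard_max_vcpu && new_max > kvm->hard_max_vcpu)
		return -EINVAL;

	kvm->max_vcpus = new_max;
	return 0;
}
```

In both variants the common code never looks at tdx_info directly; the TDX
side only has to fill in one field when the VM is created.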
-- 
Isaku Yamahata <isaku.yamahata@xxxxxxxxx>