RE: [PATCH v2 04/21] x86/virt/tdx: Add skeleton for detecting and initializing TDX on demand

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



> From: Huang, Kai <kai.huang@xxxxxxxxx>
> Sent: Tuesday, March 29, 2022 6:55 AM
> 
> On Tue, 2022-03-29 at 00:47 +1300, Tian, Kevin wrote:
> > > From: Huang, Kai <kai.huang@xxxxxxxxx>
> > > Sent: Monday, March 28, 2022 5:24 PM
> > > >
> > > > cpu_present_mask does not always represent BIOS-enabled CPUs due
> > > > to those boot options. Then why do we care whether CPUs in this mask
> > > > (if only representing a subset of BIOS-enabled CPUs) are at least
> brought
> > > > up once? It will fail at TDH.SYS.CONFIG anyway.
> > >
> > > As I said, this is used to make sure SEAMRR has been detected on all cpus,
> > > so
> > > that any BIOS misconfiguration on SEAMRR has been detected.
> Otherwise,
> > > seamrr_enabled() may not be reliable (theoretically).
> >
> > *all cpus* is questionable.
> >
> > Say BIOS enabled 8 CPUs: [0 - 7]
> >
> > cpu_present_map covers [0 - 5], due to nr_cpus=6
> >
> > You compared cpus_booted_once_mask to cpu_present_mask so if
> maxcpus
> > is set to a number < nr_cpus SEAMRR is considered disabled because you
> > cannot verify CPUs between [max_cpus, nr_cpus).
> 
> SEAMRR is not considered as disabled in this case, at least in my intention.

the function is called seamrr_enabled(), and false is returned if above
check is not passed. So it is the intention from the code.

> My
> understanding on the spec is if SEAMRR is configured as enabled on one core
> (the
> SEAMRR MSRs are core-scope), the SEAMCALL instruction can work on that
> core.  It
> is TDX's requirement that some SEAMCALL needs to be done on all BIOS-
> enabled
> CPUs to finish TDX initialization, but not SEAM's.
> 
> From this perspective, if we forget TDX at this moment but talk about SEAM
> alone, it might make sense to not just treat SEAMRR as disabled if kernel
> usable
> cpus are limited by 'nr_cpus'.  The chance that BIOS misconfigured SEAMRR is
> really rare.  If kernel can detect potential BIOS misconfiguration, it should do
> it.  Otherwise, perhaps it's more reasonable not to just treat SEAM as
> disabled.

My problem is just that you didn't adopt consistency policy for CPUs in
[maxcpus, nr_cpus) and CPUs in [nr_cpus, nr_bios_enabled_cpus). This is
the only trouble to me no matter what policy you want to pursue...

> 
> 
> > If following the same
> > rationale then you also need a proper way to detect the case where
> nr_cpus
> > < BIOS enabled number i.e. when you cannot verify SEAMRR on CPUs
> > between [nr_cpus, 7]. otherwise this check is just incomplete.
> >
> > But the entire check is actually unnecessary. You just need to verify
> SEAMRR
> > and do TDX cpu init on online CPUs. Any gap between online ones and BIOS
> > enabled ones will be caught by the TDX module at TDH.SYS.CONFIG point.
> 
> This is equivalent to not having the paranoid check in seamrr_enabled(). By
> detecting SEAMRR in identify_cpu(), the detection has already been done for
> any
> online cpu.
> 
> >
> > >
> > > Alternatively, I think we can also add check to disable TDX when
> 'maxcpus'
> > > has
> > > been specified, but I think the current way is better.
> > >
> > > >
> > > > btw your comment said that 'maxcpus' is basically an invalid mode
> > > > due to MCE broadcase problem. I didn't find any code to block it when
> > > > MCE is enabled,
> > >
> > > Please see below comment in cpu_smt_allowed():
> > >
> > > static inline bool cpu_smt_allowed(unsigned int cpu)
> > > {
> > >       ...
> > >         /*
> > >          * On x86 it's required to boot all logical CPUs at least once so
> > >          * that the init code can get a chance to set CR4.MCE on each
> > >          * CPU. Otherwise, a broadcasted MCE observing CR4.MCE=0b on
> any
> > >          * core will shutdown the machine.
> > >          */
> > >        return !cpumask_test_cpu(cpu, &cpus_booted_once_mask);
> > > }
> >
> > I saw that code. My point is more about your statement that maxcpus
> > is almost invalid due to above situation then why didn't we do anything
> > to document such restriction or throw out a warning when it's
> > misconfigured...
> 
> The sentence "'maxcpus' is an invalid operation mode due to the MCE
> broadcast
> problem" was from Thomas.  Perhaps I should not just put it into the
> comment.
> 
> Also, Thomas suggested:
> 
> "you should have a paranoia check which checks for the maxcpus
> command line parameter and if it's there and smaller than the number of
> present CPUs then you just refuse to enable TDX.
> 
> Alternatively you just have a separate cpumask tdx_availabe_mask and
> keep track of the CPUs which have been checked. When TDX is initialized
> you then can do:
> 
>     if (!cpumask_equal(cpu_present_mask, tdx_available_mask))
>     	     return;
> "
> 
> I found we can just use cpus_booted_once_mask, instead of
> tdx_available_mask, so
> I used the second way.  And instead of putting the check when initializing TDX,
> I put to seamrr_enabled() since I guess it's more reasonable to be here as the
> logic is to make sure SEAMRR has been detected on all cpus.

I'm not sure whether Thomas's comment just takes maxcpus as an example
which should be extended to other boot options like nr_cpus or he really
only cares about maxcpus. 

> 
> Hi Thomas,
> 
> If you see this, sorry for quoting your words here.  Just want to have a better
> discussion.  And appreciate if you can have some guidance here.
> 
> >
> > >
> > > > thus wonder the rationale behind and whether that
> > > > rationale can be brought to this series (i.e. no check against those
> > > > conflicting boot options and just let SEAMCALL itself to detect and fail).
> > > >
> > > > @Thomas, any guidance here?
> > > >
> > > > Thanks
> > > > Kevin
> >





[Index of Archives]     [KVM ARM]     [KVM ia64]     [KVM ppc]     [Virtualization Tools]     [Spice Development]     [Libvirt]     [Libvirt Users]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite Questions]     [Linux Kernel]     [Linux SCSI]     [XFree86]

  Powered by Linux