On Wed, 2022-11-23 at 12:05 +0100, Thomas Gleixner wrote: > Kai! > > On Wed, Nov 23 2022 at 00:30, Kai Huang wrote: > > On Tue, 2022-11-22 at 21:03 +0100, Thomas Gleixner wrote: > > > Clearly that's nowhere spelled out in the documentation, but I don't > > > buy the 'architecturaly required' argument not at all. It's an > > > implementation detail of the TDX module. > > > > I agree on hardware level there shouldn't be such requirement (not 100% sure > > though), but I guess from kernel's perspective, "the implementation detail of > > the TDX module" is sort of "architectural requirement" > > Sure, but then it needs to be clearly documented so. > > > -- at least Intel arch guys think so I guess. > > Intel "arch" guys? You might look up the meanings of "arch" in a > dictionary. LKML is not twatter. > > > > Technically there is IMO ZERO requirement to do so. > > > > > > 1) The TDX module is global > > > > > > 2) Seam-root and Seam-non-root operation are strictly a LP property. > > > > > > The only architectural prerequisite for using Seam on a LP is that > > > obviously the encryption/decryption mechanics have been initialized > > > on the package to which the LP belongs. > > > > > > I can see why it might be complicated to add/remove an LP after > > > initialization fact, but technically it should be possible. > > > > "kernel soft offline" actually isn't an issue. We can bring down a logical cpu > > after it gets initialized and then bring it up again. > > That's the whole point where this discussion started: _AFTER_ it gets > initialized. > > Which means that, e.g. adding "nosmt" to the kernel command line will > make TDX fail hard because at the point where TDX is initialized the > hyperthreads are not 'soft' online and cannot respond to anything, but > the BIOS already accounted them. > > This is just wrong as we all know that "nosmt" is sadly one of the > obvious counter measures for the never ending flood of speculation > issues. Agree. As said in my other replies, I'll follow up with TDX module guys on this. > > > Only add/removal of physical cpu will cause problem: > > You wish. > > > TDX MCHECK verifies all boot-time present cpus to make sure they are TDX- > > compatible before it enables TDX in hardware. MCHECK cannot run on hot-added > > CPU, so TDX cannot support physical CPU hotplug. > > TDX can rightfully impose the limitation that it only executes on CPUs, > which are known at boot/initialization time, and only utilizes "known" > memory. That's it, but that does not enforce to prevent physical hotplug > in general. Adding physical CPUs should have no problem I guess, they just cannot run TDX. TDX architecture doesn't expect they can run TDX code anyway. But would physical removal of boot-time present CPU cause problem? TDX MCHECK checks/records boot-time present CPUs. If a CPU is removed and then a new one is added, then TDX still treats it is TDX-compatible, but it may actually not. So if this happens, from functionality's point of view, it can break. I think TDX still wants to guarantee TDX code can work correctly on "TDX recorded" CPUs. Also, I am not sure whether there's any security issue if a malicious kernel tries to run TDX code on such removed-then-added CPU. This seems a TDX architecture problem rather than kernel policy issue. > > > We tried to get it clarified in the specification, and below is what TDX/module > > arch guys agreed to put to the TDX module spec (just checked it's not in latest > > public spec yet, but they said it will be in next release): > > > > " > > 4.1.3.2. CPU Configuration > > > > During platform boot, MCHECK verifies all logical CPUs to ensure they > > meet TDX’s > > That MCHECK falls into the category of security voodoo. > > It needs to verify _ALL_ logical CPUs to ensure that Intel did not put > different models and steppings into a package or what? I am guessing so. > > > security and certain functionality requirements, and MCHECK passes the following > > CPU configuration information to the NP-SEAMLDR, P-SEAMLDR and the TDX Module: > > > > · Total number of logical processors in the platform. > > You surely need MCHECK for this > > > · Total number of installed packages in the platform. > > and for this... > > > · A table of per-package CPU family, model and stepping etc. > > identification, as enumerated by CPUID(1).EAX. > > The above information is static and does not change after platform boot and > > MCHECK run. > > > > Note: TDX doesn’t support adding or removing CPUs from TDX security > > perimeter, as checked my MCHECK. > > More security voodoo. The TDX security perimeter has nothing to do with > adding or removing CPUs on a system. If that'd be true then TDX is a > complete fail. No argument here. > > BIOS should prevent CPUs from being hot-added or hot-removed after > > platform boots. > > If the BIOS does not prevent it, then TDX and the Seam module will not > even notice unless the OS tries to invoke seamcall() on a newly plugged > CPU. > > A newly added CPU and newly added memory should not have any impact on > the TDX integrity of the already running system. If they have then > again, TDX is broken by design. No argument here either. > > > The TDX module performs additional checks of the CPU’s configuration and > > supported features, by reading MSRs and CPUID information as described in the > > following sections. > > to ensure that the MCHECK generated table is still valid at the point > where TDX is initialized? I think it is trying to say: MCHECK doesn't do all the verifications. Some verfications are deferred to the TDX module to check when it gets initialized. > > That said, this documentation is at least better than the existing void, > but that does not make it technically correct. > > Thanks, > > tglx