Re: [PATCH v3 00/21] TDX host kernel support

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, 2022-05-04 at 07:31 -0700, Dan Williams wrote:
> On Tue, May 3, 2022 at 4:59 PM Kai Huang <kai.huang@xxxxxxxxx> wrote:
> > 
> > On Fri, 2022-04-29 at 13:40 +1200, Kai Huang wrote:
> > > On Thu, 2022-04-28 at 12:58 +1200, Kai Huang wrote:
> > > > On Wed, 2022-04-27 at 17:50 -0700, Dave Hansen wrote:
> > > > > On 4/27/22 17:37, Kai Huang wrote:
> > > > > > On Wed, 2022-04-27 at 14:59 -0700, Dave Hansen wrote:
> > > > > > > In 5 years, if someone takes this code and runs it on Intel hardware
> > > > > > > with memory hotplug, CPU hotplug, NVDIMMs *AND* TDX support, what happens?
> > > > > > 
> > > > > > I thought we could document this in the documentation saying that this code can
> > > > > > only work on TDX machines that don't have above capabilities (SPR for now).  We
> > > > > > can change the code and the documentation  when we add the support of those
> > > > > > features in the future, and update the documentation.
> > > > > > 
> > > > > > If 5 years later someone takes this code, he/she should take a look at the
> > > > > > documentation and figure out that he/she should choose a newer kernel if the
> > > > > > machine support those features.
> > > > > > 
> > > > > > I'll think about design solutions if above doesn't look good for you.
> > > > > 
> > > > > No, it doesn't look good to me.
> > > > > 
> > > > > You can't just say:
> > > > > 
> > > > >   /*
> > > > >    * This code will eat puppies if used on systems with hotplug.
> > > > >    */
> > > > > 
> > > > > and merrily await the puppy bloodbath.
> > > > > 
> > > > > If it's not compatible, then you have to *MAKE* it not compatible in a
> > > > > safe, controlled way.
> > > > > 
> > > > > > > You can't just ignore the problems because they're not present on one
> > > > > > > version of the hardware.
> > > > > 
> > > > > Please, please read this again ^^
> > > > 
> > > > OK.  I'll think about solutions and come back later.
> > > > > 
> > > 
> > > Hi Dave,
> > > 
> > > I think we have two approaches to handle memory hotplug interaction with the TDX
> > > module initialization.
> > > 
> > > The first approach is simple.  We just block memory from being added as system
> > > RAM managed by page allocator when the platform supports TDX [1]. It seems we
> > > can add some arch-specific-check to __add_memory_resource() and reject the new
> > > memory resource if platform supports TDX.  __add_memory_resource() is called by
> > > both __add_memory() and add_memory_driver_managed() so it prevents from adding
> > > NVDIMM as system RAM and normal ACPI memory hotplug [2].
> > 
> > Hi Dave,
> > 
> > Try to close how to handle memory hotplug.  Any comments to below?
> > 
> > For the first approach, I forgot to think about memory hot-remove case.  If we
> > just reject adding new memory resource when TDX is capable on the platform, then
> > if the memory is hot-removed, we won't be able to add it back.  My thinking is
> > we still want to support memory online/offline because it is purely in software
> > but has nothing to do with TDX.  But if one memory resource can be put to
> > offline, it seems we don't have any way to prevent it from being removed.
> > 
> > So if we do above, on the future platforms when memory hotplug can co-exist with
> > TDX, ACPI hot-add and kmem-hot-add memory will be prevented.  However if some
> > memory is hot-removed, it won't be able to be added back (even it is included in
> > CMR, or TDMRs after TDX module is initialized).
> > 
> > Is this behavior acceptable?  Or perhaps I have misunderstanding?
> 
> Memory online at boot uses similar kernel paths as memory-online at
> run time, so it sounds like your question is confusing physical vs
> logical remove. Consider the case of logical offline then re-online
> where the proposed TDX sanity check blocks the memory online, but then
> a new kernel is kexec'd and that kernel again trusts the memory as TD
> convertible again just because it onlines the memory in the boot path.
> For physical memory remove it seems up to the platform to block that
> if it conflicts with TDX, not for the kernel to add extra assumptions
> that logical offline / online is incompatible with TDX.

Hi Dan,

No we don't block memory online, but we block memory add.  The code I mentioned
is add_memory_resource(), while memory online code path is
memory_block_online().  Or am I wrong?

-- 
Thanks,
-Kai





[Index of Archives]     [KVM ARM]     [KVM ia64]     [KVM ppc]     [Virtualization Tools]     [Spice Development]     [Libvirt]     [Libvirt Users]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite Questions]     [Linux Kernel]     [Linux SCSI]     [XFree86]

  Powered by Linux