Re: [PATCH v8 07/16] x86/virt/tdx: Use all system memory when initializing TDX module as TDX memory

"Huang, Kai" <kai.huang@xxxxxxxxx> · Thu, 12 Jan 2023 11:33:06 +0000

On Mon, 2023-01-09 at 08:51 -0800, Dave Hansen wrote:
> > > > > > +       list_for_each_entry(tmb, &tdx_memlist, list) {
> > > > > > +               /*
> > > > > > +                * The new range is TDX memory if it is fully
> > > > > > covered by
> > > > > > +                * any TDX memory block.
> > > > > > +                *
> > > > > > +                * Note TDX memory blocks are originated from
> > > > > > memblock
> > > > > > +                * memory regions, which can only be contiguous when
> > > > > > two
> > > > > > +                * regions have different NUMA nodes or flags. 
> > > > > > Therefore
> > > > > > +                * the new range cannot cross multiple TDX memory
> > > > > > blocks.
> > > > > > +                */
> > > > > > +               if (start_pfn >= tmb->start_pfn && end_pfn <= tmb-
> > > > > > >end_pfn)
> > > > > > +                       return true;
> > > > > > +       }
> > > > > > +       return false;
> > > > > > +}
> > > > 
> > > > I don't really like that comment.  It should first state its behavior
> > > > and assumptions, like:
> > > > 
> > > >      This check assumes that the start_pfn<->end_pfn range does not
> > > >      cross multiple tdx_memlist entries.
> > > > 
> > > > Only then should it describe why that is OK:
> > > > 
> > > >      A single memory hotplug even across mutliple memblocks (from
> > > >      which tdx_memlist entries are derived) is impossible.  ... then
> > > >      actually explain
> > > > 
> > 
> > How about below?
> > 
> >          /*
> >           * This check assumes that the start_pfn<->end_pfn range does not
> > cross
> >           * multiple tdx_memlist entries. A single memory hotplug event
> > across
> >           * multiple memblocks (from which tdx_memlist entries are derived)
> > is
> >           * impossible. That means start_pfn<->end_pfn range cannot exceed a
> >           * tdx_memlist entry, and the new range is TDX memory if it is
> > fully
> >           * covered by any tdx_memlist entry.
> >           */
> 
> I was hoping you would actually explain why it is impossible.
> 
> Is there something fundamental that keeps a memory area that spans two
> nodes from being removed and then a new area added that is comprised of
> a single node?
> 
> Boot time:
> 
> 	| memblock  |  memblock |
> 	<--Node=0--> <--Node=1-->
> 
> Funky hotplug... nothing to see here, then:
> 
> 	<--------Node=2-------->
> 
> I would believe that there is no current bare-metal TDX system that has
> an implementation like this.  But, the comments above speak like it's
> fundamentally impossible.  That should be clarified.
> 
> In other words, that comment talks about memblock attributes as being
> the core underlying reason that that simplified check is OK.  Is that
> it, or is it really the reduced hotplug feature set on TDX systems?

Hi Dave,

I think I have been forgetting that we have switched to reject non-TDX memory in
memory online, but not in memory hot-add.  

Memory offline/online is done on granularity of 'struct memory_block', but not
memblock.  In fact, the hotpluggable memory region (one memblock) must be
multiple of memory_block, and a "to-be-online" memory_block must be full range
memory (no memory hole).

So if I am not missing something, IIUC that means if the start_pfn<->end_pfn is
TDX memory, it must be fully within some @tdx_memlist entry, but cannot cross
multiple small entries.  And the memory hotplug case in your above diagram
actually shouldn't matter.

If above stands, how about below?

        /*
         * This check assumes that the start_pfn<->end_pfn range does not 
         * cross multiple @tdx_memlist entries.  A single memory online   
         * event across multiple @tdx_memlist entries (which are derived  
         * from memblocks at the time of module initialization) is not    
         * possible.
         *
         * This is because memory offline/online is done on granularity   
         * of 'struct memory_block', and the hotpluggable memory region   
         * (one memblock) must be multiple of memory_block.  Also, the    
         * "to-be-online" memory_block must be full memory (no memory     
         * hole, i.e. containing multiple small memblocks).
         *
         * This means if the start_pfn<->end_pfn range is TDX memory, it  
         * must be fully within one @tdx_memlist entry, but cannot cross  
         * multiple small entries.
         */
        list_for_each_entry(tmb, &tdx_memlist, list) {
                if (start_pfn >= tmb->start_pfn && end_pfn <= tmb->end_pfn)
                        return true;
        }