On 30/08/24 14:52, Huang, Kai wrote: > On Fri, 2024-08-30 at 13:50 +0300, Hunter, Adrian wrote: >> On 27/08/24 10:14, Kai Huang wrote: >>> A TDX module initialization failure was reported on a Emerald Rapids >>> platform: >>> >>> virt/tdx: initialization failed: TDMR [0x0, 0x80000000): reserved areas exhausted. >>> virt/tdx: module initialization failed (-28) >>> >>> As part of initializing the TDX module, the kernel informs the TDX >>> module of all "TDX-usable memory regions" using an array of TDX defined >>> structure "TD Memory Region" (TDMR). Each TDMR must be in 1GB aligned >>> and in 1GB granularity, and all "non-TDX-usable memory holes" within a >>> given TDMR must be marked as "reserved areas". The TDX module reports a >>> maximum number of reserved areas that can be supported per TDMR. >> >> The statement: >> >> ... all "non-TDX-usable memory holes" within a >> given TDMR must be marked as "reserved areas". >> >> is not exactly true, which is essentially the basis of this fix. > > Hmm I think I see what you mean. Perhaps the "must be marked as" confuses > you? > > The "TDX-usable memory" here means all pages that can potentially be used by > TDX. They don't have to be actually used by TDX, i.e., "TDX-usable memory" vs > "TDX-used memory". > > And the "non-TDX-usable memory holes" means the memory regions that cannot be > possibly used by TDX at all. > > Is below better if I change "must be marked as" to "are"? > > As part of initializing the TDX module, the kernel informs the TDX > module of all "TDX-usable memory regions" using an array of TDX defined > structure "TD Memory Region" (TDMR). Each TDMR must be in 1GB aligned > and in 1GB granularity, and all "non-TDX-usable memory holes" within a > given TDMR are marked as "reserved areas". The TDX module reports a > maximum number of reserved areas that can be supported per TDMR. Yes, that's better. > > Note in my logic here we don't need to mention CMR. Here I just want to tell > the fact that each TDMR has number of "reserved areas" and the maximum number > is reported by TDX module. > >> >> The relevant requirements are (from the spec): >> >> Any non-reserved 4KB page within a TDMR must be convertible >> i.e., it must be within a CMR > > Yes. > >> >> Reserved areas within a TDMR need not be within a CMR. > > Yes. They need not to be, but they can be. > >> >> PAMT areas must not overlap with TDMR non-reserved areas; >> however, they may reside within TDMR reserved areas >> (as long as these are convertible). > > Yes. However in implementation PAMTs are out of page allocator so they are > all within TDMRs thus need to be put to reserved areas. > > Those are TDX architectural requirements. They are not all related to the fix > of this problem. The most important thing here is: > > Any non-reserved memory within a TDMR must be within CMR. > > That means as long as one memory region is CMR, it doesn't need to be in > "reserved area" from TDX architecture's prespective. > > That means we can include more memory regions (even they cannot be used by TDX > at all) as "non-reserved" areas in TDMRs to reduce the number of "reserved > areas" as long as those regions are within CMR. > > This is the logic behind this fix. > >> >>> >>> Currently, the kernel finds those "non-TDX-usable memory holes" within a >>> given TDMR by walking over a list of "TDX-usable memory regions", which >>> essentially reflects the "usable" regions in the e820 table (w/o memory >>> hotplug operations precisely, but this is not relevant here). >> >> But including e820 table regions that are not "usable" in the TDMR >> reserved areas is not necessary - it is not one of the rules. > > True. That's why we can do this fix. > >> >> What confused me initially was that I did not realize the we already >> require that the TDX Module does not touch memory in the TDMR >> non-reserved areas not specifically allocated to it. So it makes no >> difference to the TDX Module what the pages that have not been allocated >> to it, are used for. >> >>> >>> As shown above, the root cause of this failure is when the kernel tries >>> to construct a TDMR to cover address range [0x0, 0x80000000), there >>> are too many memory holes within that range and the number of memory >>> holes exceeds the maximum number of reserved areas. >>> >>> The E820 table of that platform (see [1] below) reflects this: the >>> number of memory holes among e820 "usable" entries exceeds 16, which is >>> the maximum number of reserved areas TDX module supports in practice. >>> >>> === Fix === >>> >>> There are two options to fix this: 1) reduce the number of memory holes >>> when constructing a TDMR to save "reserved areas"; 2) reduce the TDMR's >>> size to cover fewer memory regions, thus fewer memory holes. >> >> Probably better to try and get rid of this "two options" stuff and focus >> on how this is a simple and effective fix. > > As I mentioned in another reply I would prefer to keep those options since I > believe they can provide a full view to the reviewers.