On Mon, May 09, 2022 at 11:47:43AM -0700, Dave Hansen wrote:
> ... adding some KVM/TDX folks

+ AMD SEV folks as they're going to probably need something like that
too.

> On 5/6/22 12:02, Boris Petkov wrote:
> >> This node attribute punts the problem back out to userspace. It
> >> gives userspace the ability to steer allocations to compatible NUMA
> >> nodes. If something goes wrong, they can use other NUMA ABIs to
> >> inspect the situation, like /proc/$pid/numa_maps.
> > That's all fine and dandy but I still don't see the *actual*,
> > real-life use case of why something would request memory of
> > particular encryption capabilities. Don't get me wrong - I'm not
> > saying there are not such use cases - I'm saying we should go all the
> > way and fully define properly *why* we're doing this whole hoopla.
>
> Let's say TDX is running on a system with mixed encryption
> capabilities*. Some NUMA nodes support TDX and some don't. If that
> happens, your guest RAM can come from anywhere. When the host kernel
> calls into the TDX module to add pages to the guest (via
> TDH.MEM.PAGE.ADD) it might get an error back from the TDX module. At
> that point, the host kernel is stuck. It's got a partially created
> guest and no recourse to fix the error.

Thanks for that detailed use case, btw!

> This new ABI provides a way to avoid that situation in the first place.
> Userspace can look at sysfs to figure out which NUMA nodes support
> "encryption" (aka. TDX) and can use the existing NUMA policy ABI to
> avoid TDH.MEM.PAGE.ADD failures.
>
> So, here's the question for the TDX folks: are these mixed-capability
> systems a problem for you? Does this ABI help you fix the problem?

What I'm not really sure about either: is per-node granularity ok? I
guess it is but let me ask it anyway...

> Will your userspace (qemu and friends) actually consume this ABI?

Same question for SEV folks - do you guys think this interface would
make sense for the SEV side of things?

> * There are three ways we might hit a system with this issue:
> 1. NVDIMMs that don't support TDX, like lack of memory integrity
>    protection.
> 2. CXL-attached memory controllers that can't do encryption at all
> 3. Nominally TDX-compatible memory that was not covered/converted by
>    the kernel for some reason (memory hot-add, or ran out of TDMR
>    resources)

And I think some of those might be of interest to the AMD side of things
too.

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette