On Thu, Jul 06, 2023, Dave Hansen wrote:
> On 7/5/23 07:57, Peter Zijlstra wrote:
> > On Wed, Jul 05, 2023 at 07:34:06AM -0700, Dave Hansen wrote:
> >> On 7/4/23 09:58, Peter Zijlstra wrote:
> >>> If we have concerns about allocating the PAMT array, can't we use CMA
> >>> for this? Allocate the whole thing at boot as CMA such that when not
> >>> used for TDX it can be used for regular things like userspace and
> >>> filecache pages?
> >> I never thought of CMA as being super reliable. Maybe it's improved
> >> over the years.
> >>
> >> KVM also has a rather nasty habit of pinning pages, like for device
> >> passthrough. I suspect that means that we'll have one of two scenarios:
> >>
> >> 1. CMA works great, but the TDX/CMA area is unusable for KVM because
> >>    it's pinning all its pages and they just get moved out of the CMA
> >>    area immediately. The CMA area is effectively wasted.
> >> 2. CMA sucks, and users get sporadic TDX failures when they wait a long
> >>    time to run a TDX guest after boot. Users just work around the CMA
> >>    support by starting up TDX guests at boot or demanding a module
> >>    parameter be set. Hacking in CMA support was a waste.
> >>
> >> Am I just too much of a pessimist?
> > Well, if CMA still sucks, then that needs fixing. If CMA works, but we
> > have a circular fail in that KVM needs to long-term pin the PAMT pages
> > but long-term pin is evicted from CMA (the whole point of long-term pin,
> > after all), then surely we can break that cycle somehow, since in this
> > case the purpose of the CMA is being able to grab that memory chunk when
> > we needs it.
> >
> > That is, either way around is just a matter of a little code, no?
>
> It's not a circular dependency, it's conflicting requirements.
>
> CMA makes memory more available, but only in the face of unpinned pages.
>
> KVM can pin lots of pages, even outside of TDX-based VMs.
>
> So we either need to change how CMA works fundamentally or stop KVM from
> pinning pages.

Nit, I think you're conflating KVM with VFIO and/or IOMMU code. Device passthrough does pin large chunks of memory, but KVM itself isn't involved or even aware of the pins. HugeTLB is another case where CMA will be effectively used to serve guest memory, but again KVM isn't the thing doing the pinning.