On Fri, Aug 27, 2021, Andy Lutomirski wrote:
>
> On Thu, Aug 26, 2021, at 2:26 PM, David Hildenbrand wrote:
> > On 26.08.21 19:05, Andy Lutomirski wrote:
>
> > > Oof.  That's quite a requirement.  What's the point of the VMA once all
> > > this is done?
> >
> > You can keep using things like mbind(), madvise(), ... and the GUP code
> > with a special flag might mostly just do what you want. You won't have
> > to reinvent too many wheels on the page fault logic side at least.

Ya, Kirill's RFC more or less proved a special GUP flag would indeed Just Work.
However, the KVM page fault side of things would require only a handful of small
changes to send private memslots down a different path.  Compared to the rest of
the enabling, it's quite minor.

The counter to that is other KVM architectures would need to learn how to use the
new APIs, though I suspect that there will be a fair bit of arch enabling
regardless of what route we take.

> You can keep calling the functions.  The implementations working is a
> different story: you can't just unmap (pte_numa-style or otherwise) a private
> guest page to quiesce it, move it with memcpy(), and then fault it back in.

Ya, I brought this up in my earlier reply.  Even the initial implementation
(without real NUMA support) would likely be painful, e.g. the KVM TDX RFC/PoC
adds dedicated logic in KVM to handle the case where NUMA balancing zaps a
_pinned_ page and then KVM faults in the same pfn.  It's not thaaat ugly, but
it's arguably more invasive to KVM's page fault flows than a new fd-based
private memslot scheme.