Re: [RFC PATCH 0/4] KVM: ioctl for populating guest_memfd

David Hildenbrand <david@xxxxxxxxxx> · Wed, 20 Nov 2024 19:29:17 +0100

On 20.11.24 18:21, Nikita Kalyazin wrote:

On 20/11/2024 16:44, David Hildenbrand wrote:
If the problem is the "pagecache" overhead, then yes, it will be a
harder nut to crack. But maybe there is some low-hanging fruit to
optimize? Finding the main cause of the added overhead would be
interesting.

Agreed, knowing the exact root cause would be really nice.

Can you compare uffdio_copy() when using anonymous memory vs. shmem?
That's likely the best we could currently achieve with guest_memfd.

Yeah, I was doing that too. It was ~28% slower in my setup, while
with guest_memfd it was ~34% slower.
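
For reference, the anon vs. shmem comparison discussed above could be
sketched roughly as follows. This is only a minimal illustration, not the
benchmark that produced the numbers in this thread: the helper names
open_uffd() and uffd_populate_ns() are made up, a shared anonymous mapping
(which the kernel backs with shmem) stands in for an explicit shmem file,
and the page count is left to the caller.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <linux/userfaultfd.h>
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper: open a userfaultfd, preferring user-mode-only
 * so that it can work without privileges on newer kernels. */
static int open_uffd(void)
{
    int fd = -1;
#ifdef UFFD_USER_MODE_ONLY
    fd = (int)syscall(SYS_userfaultfd, O_CLOEXEC | UFFD_USER_MODE_ONLY);
#endif
    if (fd < 0)
        fd = (int)syscall(SYS_userfaultfd, O_CLOEXEC);
    return fd;
}

/*
 * Hypothetical helper: populate npages of a fresh mapping with
 * UFFDIO_COPY, one page per ioctl, and time the loop.
 *   use_shmem == 0: private anonymous memory
 *   use_shmem != 0: shared anonymous memory (shmem-backed)
 * Returns elapsed nanoseconds, or -1 if any step fails.
 */
long long uffd_populate_ns(int use_shmem, size_t npages)
{
    size_t psz = (size_t)sysconf(_SC_PAGESIZE);
    size_t len = npages * psz;
    int flags = MAP_ANONYMOUS | (use_shmem ? MAP_SHARED : MAP_PRIVATE);
    void *dst = MAP_FAILED, *src = MAP_FAILED;
    struct uffdio_api api = { .api = UFFD_API };
    struct uffdio_register reg = { .mode = UFFDIO_REGISTER_MODE_MISSING };
    struct timespec t0, t1;
    long long ns = -1;

    int uffd = open_uffd();
    if (uffd < 0)
        return -1;
    if (ioctl(uffd, UFFDIO_API, &api) < 0)
        goto out;

    dst = mmap(NULL, len, PROT_READ | PROT_WRITE, flags, -1, 0);
    src = mmap(NULL, len, PROT_READ | PROT_WRITE,
               MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (dst == MAP_FAILED || src == MAP_FAILED)
        goto out;
    /* Fault in the source so that only populating dst is timed. */
    memset(src, 0xab, len);

    reg.range.start = (uintptr_t)dst;
    reg.range.len = len;
    if (ioctl(uffd, UFFDIO_REGISTER, &reg) < 0)
        goto out;

    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (size_t off = 0; off < len; off += psz) {
        struct uffdio_copy cp = {
            .dst = (uintptr_t)dst + off,
            .src = (uintptr_t)src + off,
            .len = psz,
        };
        if (ioctl(uffd, UFFDIO_COPY, &cp) < 0)
            goto out;
    }
    clock_gettime(CLOCK_MONOTONIC, &t1);

    /* Only report a time if the destination really holds the data. */
    if (memcmp(dst, src, len) == 0)
        ns = (t1.tv_sec - t0.tv_sec) * 1000000000LL +
             (t1.tv_nsec - t0.tv_nsec);
out:
    if (dst != MAP_FAILED)
        munmap(dst, len);
    if (src != MAP_FAILED)
        munmap(src, len);
    close(uffd);
    return ns;
}
```

Calling uffd_populate_ns(0, n) and uffd_populate_ns(1, n) many times and
comparing the distributions, rather than single runs, is probably the only
meaningful way to use something like this. A return value of -1 usually
just means userfaultfd is not permitted in the environment.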

I looked into uffdio_copy() for shmem and we still walk+modify page 
tables. In theory, we could try hacking that out: for filling the 
pagecache we would only need the vma properties, not the page table 
properties; that would then really resemble "only modify the pagecache".

That would likely resemble what we would expect with guest_memfd: work 
only on the pagecache and not the page tables. So it's rather surprising 
that guest_memfd is slower than that, as it currently doesn't mess with 
user page tables at all.

The variance of the data was quite
high so the difference may well be just noise. In other words, I'd be
much happier if we could bring guest_memfd (or even shmem) performance
closer to the anon/private than if we just equalised guest_memfd with
shmem (which are probably already pretty close).

Makes sense. Best we can do is:

anon: work only on page tables
shmem/guest_memfd: work only on pagecache

So at least "only one treelike structure to update".

--
Cheers,

David / dhildenb