On 20.11.24 18:21, Nikita Kalyazin wrote:
On 20/11/2024 16:44, David Hildenbrand wrote:
If the problem is the "pagecache" overhead, then yes, it will be a
harder nut to crack. But maybe there is some low-hanging fruit to
optimize? Finding the main cause of the added overhead would be
interesting.
Agreed, knowing the exact root cause would be really nice.
Can you compare uffdio_copy() when using anonymous memory vs. shmem?
That's likely the best we could currently achieve with guest_memfd.
Yeah, I was doing that too. It was about 28% slower in my setup, while
with guest_memfd it was about 34% slower.
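For reference, a rough sketch of how such an anon vs. shmem comparison
could be timed from userspace. This is not the benchmark used in this
thread; the helper name time_uffdio_copy() is made up for illustration,
a memfd stands in for shmem, and guest_memfd is not covered because it
needs KVM plus the patches under discussion. It assumes a kernel where
userfaultfd is available to the caller and returns -1 otherwise.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <linux/userfaultfd.h>
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <time.h>
#include <unistd.h>

/*
 * Hypothetical helper: populate npages of a userfaultfd-registered range
 * with UFFDIO_COPY and report the average cost per page in nanoseconds.
 * use_shmem == 0: private anonymous destination.
 * use_shmem != 0: shared memfd (shmem) destination.
 * Returns 0 on success, -1 if any step fails (e.g. userfaultfd not permitted).
 */
int time_uffdio_copy(int use_shmem, size_t npages, double *ns_per_page)
{
    size_t psz = (size_t)sysconf(_SC_PAGESIZE);
    size_t len = npages * psz;
    void *dst = MAP_FAILED, *src = MAP_FAILED;
    int uffd = -1, memfd = -1, rc = -1;
    struct timespec t0, t1;

#ifdef UFFD_USER_MODE_ONLY
    /* User-mode-only fds can usually be created without privileges. */
    uffd = (int)syscall(SYS_userfaultfd, O_CLOEXEC | UFFD_USER_MODE_ONLY);
#endif
    if (uffd < 0)
        uffd = (int)syscall(SYS_userfaultfd, O_CLOEXEC);
    if (uffd < 0)
        return -1;

    struct uffdio_api api = { .api = UFFD_API, .features = 0 };
    if (ioctl(uffd, UFFDIO_API, &api) < 0)
        goto out;

    if (use_shmem) {
        memfd = memfd_create("uffd-bench", MFD_CLOEXEC);
        if (memfd < 0 || ftruncate(memfd, (off_t)len) < 0)
            goto out;
        dst = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, memfd, 0);
    } else {
        dst = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    }
    src = mmap(NULL, len, PROT_READ | PROT_WRITE,
               MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (dst == MAP_FAILED || src == MAP_FAILED)
        goto out;
    memset(src, 0xAB, len); /* fault in the source so only the copy is timed */

    struct uffdio_register reg = {
        .range = { .start = (uintptr_t)dst, .len = len },
        .mode = UFFDIO_REGISTER_MODE_MISSING,
    };
    if (ioctl(uffd, UFFDIO_REGISTER, &reg) < 0)
        goto out;

    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (size_t i = 0; i < npages; i++) {
        struct uffdio_copy cp = {
            .dst = (uintptr_t)dst + i * psz,
            .src = (uintptr_t)src + i * psz,
            .len = psz,
            .mode = 0,
        };
        if (ioctl(uffd, UFFDIO_COPY, &cp) < 0 || cp.copy != (long long)psz)
            goto out;
    }
    clock_gettime(CLOCK_MONOTONIC, &t1);

    /* Sanity check: the destination must now hold the source contents. */
    if (memcmp(dst, src, len) != 0)
        goto out;

    *ns_per_page = ((double)(t1.tv_sec - t0.tv_sec) * 1e9 +
                    (double)(t1.tv_nsec - t0.tv_nsec)) / (double)npages;
    rc = 0;
out:
    if (dst != MAP_FAILED)
        munmap(dst, len);
    if (src != MAP_FAILED)
        munmap(src, len);
    if (memfd >= 0)
        close(memfd);
    close(uffd);
    return rc;
}
```

Calling it once with use_shmem = 0 and once with use_shmem = 1 over the
same number of pages gives the two numbers to compare. A single run is
noisy, so repeated runs are needed before reading anything into a
difference of a few percent.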
I looked into uffdio_copy() for shmem and we still walk+modify page
tables. In theory, we could try hacking that out: for filling the
pagecache we would only need the vma properties, not the page table
properties; that would then really resemble "only modify the pagecache".
That would likely resemble what we would expect with guest_memfd: work
only on the pagecache and not the page tables. So it's rather surprising
that guest_memfd is slower than that, as it currently doesn't mess with
user page tables at all.
The variance of the data was quite
high so the difference may well be just noise. In other words, I'd be
much happier if we could bring guest_memfd (or even shmem) performance
closer to the anon/private than if we just equalised guest_memfd with
shmem (which are probably already pretty close).
Makes sense. Best we can do is:
anon: work only on page tables
shmem/guest_memfd: work only on pagecache
So at least "only one treelike structure to update".
--
Cheers,
David / dhildenb