On Sun, May 7, 2023 at 6:23 PM Peter Xu <peterx@xxxxxxxxxx> wrote:
>
> I explained why I think it could be useful to test this in my reply to
> Nadav, do you think it makes sense to you?

Ah, I actually missed your reply to Nadav: I didn't realize you had sent
*two* emails.

> While OTOH if multi-uffd can scale well, then there's a chance of
> general solution as long as we can remove the single-queue
> contention over the whole guest mem.

I don't quite understand your statement here: if we pursue multi-uffd,
then it seems to me that by definition we've removed the single queue(s)
covering all of guest memory, and thus the associated contention. We'd
still have the issue of multiple vCPUs contending for a single UFFD,
though.

But I do share some of your curiosity about multi-uffd performance,
especially since some of my earlier numbers indicated that multi-uffd
doesn't scale linearly, even when each vCPU corresponds to a single
UFFD.

So, I grabbed some more profiles for 32 and 64 vCPUs using the
following command:

  ./demand_paging_test -b 512M -u MINOR -s shmem -v <n> -r 1 -c <1,...,n>

The 32-vCPU config achieves a per-vCPU paging rate of 8.8k. That rate
drops to 3.9k (!) with 64 vCPUs. I don't immediately see the issue from
the traces, but it's safe to say it's definitely not scaling. Since I
applied your fixes from earlier, the prefaulting isn't being counted
against the demand paging rate either.

32-vCPU profile:
https://drive.google.com/file/d/19ZZDxZArhSsbW_5u5VcmLT48osHlO9TG/view?usp=drivesdk

64-vCPU profile:
https://drive.google.com/file/d/1dyLOLVHRNdkUoFFr7gxqtoSZGn1_GqmS/view?usp=drivesdk

Do let me know if you need svg files instead, and I'll try to figure
that out.
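
In case it helps pin down what I mean by "multi-uffd" above: one
userfaultfd per vCPU's slice of guest memory, so MINOR faults on that
slice queue on a per-vCPU fd rather than one shared fd. A minimal
sketch along those lines (not the selftest's actual code; the helper
name and the per-vCPU carving of the address range are just for
illustration):

	#include <fcntl.h>
	#include <linux/userfaultfd.h>
	#include <sys/ioctl.h>
	#include <sys/syscall.h>
	#include <unistd.h>

	/* Register one uffd over a single vCPU's slice of guest memory. */
	static int uffd_for_range(void *start, size_t len)
	{
		struct uffdio_api api = {
			.api = UFFD_API,
			.features = UFFD_FEATURE_MINOR_SHMEM,
		};
		struct uffdio_register reg = {
			.range = {
				.start = (unsigned long)start,
				.len = len,
			},
			.mode = UFFDIO_REGISTER_MODE_MINOR,
		};
		int uffd = syscall(__NR_userfaultfd, O_CLOEXEC | O_NONBLOCK);

		if (uffd < 0)
			return -1;
		if (ioctl(uffd, UFFDIO_API, &api) ||
		    ioctl(uffd, UFFDIO_REGISTER, &reg)) {
			close(uffd);
			return -1;
		}
		/* Poll/read this fd from a dedicated handler thread. */
		return uffd;
	}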