Re: [PATCH 1/6] KVM: Cache the least recently used slot index per vCPU

On 31/07/21 00:37, David Matlack wrote:
The memslot for a given gfn is looked up multiple times during page
fault handling. Avoid binary searching for it multiple times by caching
the least recently used slot. There is an existing VM-wide LRU slot but
that does not work well for cases where vCPUs are accessing memory in
different slots (see performance data below).

Another benefit of caching the least recently used slot (versus looking
up the slot once and passing around a pointer) is speeding up memslot
lookups *across* faults and during spte prefetching.
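In code, the per-vCPU fast path looks roughly like this (a sketch based
on the description above, not the literal patch; get_slot and
__search_memslots are the helpers this series introduces):

static struct kvm_memory_slot *
kvm_vcpu_gfn_to_memslot(struct kvm_vcpu *vcpu, gfn_t gfn)
{
	struct kvm_memslots *slots = kvm_vcpu_memslots(vcpu);
	struct kvm_memory_slot *slot = get_slot(slots, vcpu->lru_slot_index);
	int slot_index;

	/* Fast path: the cached slot still covers this gfn. */
	if (slot && gfn >= slot->base_gfn &&
	    gfn < slot->base_gfn + slot->npages)
		return slot;

	/* Slow path: binary search, then cache the winning index. */
	slot_index = __search_memslots(slots, gfn);
	slot = get_slot(slots, slot_index);
	if (slot)
		vcpu->lru_slot_index = slot_index;

	return slot;
}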

To measure the performance of this change I ran dirty_log_perf_test with
64 vCPUs and 64 memslots and measured "Populate memory time" and
"Iteration 2 dirty memory time".  Tests were ran with eptad=N to force
dirty logging to use fast_page_fault so its performance could be
measured.

Config     | Metric                        | Before | After
---------- | ----------------------------- | ------ | ------
tdp_mmu=Y  | Populate memory time          | 6.76s  | 5.47s
tdp_mmu=Y  | Iteration 2 dirty memory time | 2.83s  | 0.31s
tdp_mmu=N  | Populate memory time          | 20.4s  | 18.7s
tdp_mmu=N  | Iteration 2 dirty memory time | 2.65s  | 0.30s

The "Iteration 2 dirty memory time" results are especially compelling
because they are equivalent to running the same test with a single
memslot. In other words, fast_page_fault performance no longer scales
with the number of memslots.

Signed-off-by: David Matlack <dmatlack@xxxxxxxxxx>

It's the *most* recently used slot index, of course. :) That's true of lru_slot as well.

+static inline struct kvm_memory_slot *get_slot(struct kvm_memslots *slots, int slot_index)
+{
+	if (slot_index < 0 || slot_index >= slots->used_slots)
+		return NULL;
+
+	return &slots->memslots[slot_index];
+}
+

Since there are plenty of arrays inside struct kvm_memory_slot, do we want to protect this against speculative out-of-bounds accesses with array_index_nospec?
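Something like this, for instance (a sketch; array_index_nospec is the
helper from <linux/nospec.h>):

static inline struct kvm_memory_slot *get_slot(struct kvm_memslots *slots, int slot_index)
{
	if (slot_index < 0 || slot_index >= slots->used_slots)
		return NULL;

	/*
	 * Clamp the index under speculation so a mispredicted bounds
	 * check cannot be used to read past the memslots array.
	 */
	slot_index = array_index_nospec(slot_index, slots->used_slots);

	return &slots->memslots[slot_index];
}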

+static inline struct kvm_memory_slot *
+search_memslots(struct kvm_memslots *slots, gfn_t gfn)
+{
+	int slot_index = __search_memslots(slots, gfn);
+
+	return get_slot(slots, slot_index);
 }

Let's use this occasion to remove the duplication between __gfn_to_memslot and search_memslots; you can make search_memslots do the search and place the LRU (ehm, MRU) code in __gfn_to_memslot only (a sketch follows the two steps below). So:

- the new patch 1 (something like "make search_memslots search without LRU caching") is basically this series's patch 2, plus a tree-wide replacement of search_memslots with __gfn_to_memslot.

- the new patch 2 is this series's patch 1, except kvm_vcpu_gfn_to_memslot uses search_memslots instead of __search_memslots. The comments in patch 2's commit message about the double misses move to this commit message.
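The end state would look roughly like this (a sketch of the suggested
split, reusing this series's helpers; the cached index shown is the
existing VM-wide lru_slot field):

static inline struct kvm_memory_slot *
search_memslots(struct kvm_memslots *slots, gfn_t gfn)
{
	/* Pure binary search, no caching (new patch 1). */
	return get_slot(slots, __search_memslots(slots, gfn));
}

static inline struct kvm_memory_slot *
__gfn_to_memslot(struct kvm_memslots *slots, gfn_t gfn)
{
	/* MRU caching lives here only (new patch 2). */
	struct kvm_memory_slot *slot = get_slot(slots, slots->lru_slot);

	if (slot && gfn >= slot->base_gfn &&
	    gfn < slot->base_gfn + slot->npages)
		return slot;

	slot = search_memslots(slots, gfn);
	if (slot)
		slots->lru_slot = slot - slots->memslots;

	return slot;
}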

+	if (slot)
+		vcpu->lru_slot_index = slot_index;

Let's call it lru_slot for consistency with the field of struct kvm_memslots.

Paolo



