On Tue, Jul 28, 2020 at 04:37:38PM +0200, Vitaly Kuznetsov wrote:
> This is a continuation of "[PATCH RFC 0/5] KVM: x86: KVM_MEM_ALLONES
> memory" work:
> https://lore.kernel.org/kvm/20200514180540.52407-1-vkuznets@xxxxxxxxxx/
> and pairs with Julia's "x86/PCI: Use MMCONFIG by default for KVM guests":
> https://lore.kernel.org/linux-pci/20200722001513.298315-1-jusual@xxxxxxxxxx/
>
> PCIe config space can (depending on the configuration) be quite big but
> usually is sparsely populated. Guest may scan it by accessing individual
> device's page which, when device is missing, is supposed to have 'pci
> hole' semantics: reads return '0xff' and writes get discarded.
>
> When testing Linux kernel boot with QEMU q35 VM and direct kernel boot
> I observed 8193 accesses to PCI hole memory. When such exit is handled
> in KVM without exiting to userspace, it takes roughly 0.000001 sec.
> Handling the same exit in userspace is six times slower (0.000006 sec) so
> the overall difference is 0.04 sec. This may be significant for 'microvm'
> ideas.
>
> Note, the same speed can already be achieved by using KVM_MEM_READONLY
> but doing this would require allocating real memory for all missing
> devices and e.g. 8192 pages gives us 32mb. This will have to be allocated
> for each guest separately and for 'microvm' use-cases this is likely
> a no-go.
>
> Introduce special KVM_MEM_PCI_HOLE memory: userspace doesn't need to
> back it with real memory, all reads from it are handled inside KVM and
> return '0xff'. Writes still go to userspace but these should be extremely
> rare.
>
> The original 'KVM_MEM_ALLONES' idea had additional optimizations: KVM
> was mapping all 'PCI hole' pages to a single read-only page stuffed with
> 0xff. This is omitted in this submission as the benefits are unclear:
> KVM will have to allocate SPTEs (either on demand or aggressively) and
> this also consumes time/memory.

Curious about this: if we do it aggressively on the 1st fault, how long
does it take to allocate 256 huge page SPTEs?  And the amount of memory
seems pretty small then, right?

> We can always take a look at possible
> optimizations later.
>
> Vitaly Kuznetsov (3):
>   KVM: x86: move kvm_vcpu_gfn_to_memslot() out of try_async_pf()
>   KVM: x86: introduce KVM_MEM_PCI_HOLE memory
>   KVM: selftests: add KVM_MEM_PCI_HOLE test
>
>  Documentation/virt/kvm/api.rst                |  19 ++-
>  arch/x86/include/uapi/asm/kvm.h               |   1 +
>  arch/x86/kvm/mmu/mmu.c                        |  19 +--
>  arch/x86/kvm/mmu/paging_tmpl.h                |  10 +-
>  arch/x86/kvm/x86.c                            |  10 +-
>  include/linux/kvm_host.h                      |   7 +-
>  include/uapi/linux/kvm.h                      |   3 +-
>  tools/testing/selftests/kvm/Makefile          |   1 +
>  .../testing/selftests/kvm/include/kvm_util.h  |   1 +
>  tools/testing/selftests/kvm/lib/kvm_util.c    |  81 +++++++------
>  .../kvm/x86_64/memory_slot_pci_hole.c         | 112 ++++++++++++++++++
>  virt/kvm/kvm_main.c                           |  39 ++++--
>  12 files changed, 243 insertions(+), 60 deletions(-)
>  create mode 100644 tools/testing/selftests/kvm/x86_64/memory_slot_pci_hole.c
>
> --
> 2.25.4
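
For anyone wanting to poke at this from userspace, below is a rough sketch
of how a PCI-hole slot could be registered once the series lands. Anything
in it that isn't in the cover letter is an assumption: the KVM_MEM_PCI_HOLE
bit value is a placeholder (use the one from the patched <linux/kvm.h>),
the guest physical range is a made-up stand-in for an unpopulated chunk of
the q35 MMCONFIG window, and userspace_addr is left at 0 since the whole
point is that no backing memory is needed.

	/*
	 * Hypothetical sketch: register a PCI-hole memslot.
	 * KVM_MEM_PCI_HOLE is the flag proposed by this series; the bit
	 * value below is a placeholder, take the real one from the
	 * patched <linux/kvm.h>.
	 */
	#include <fcntl.h>
	#include <linux/kvm.h>
	#include <stdio.h>
	#include <sys/ioctl.h>

	#ifndef KVM_MEM_PCI_HOLE
	#define KVM_MEM_PCI_HOLE (1UL << 2)	/* placeholder bit value */
	#endif

	int main(void)
	{
		int sys_fd = open("/dev/kvm", O_RDWR);
		if (sys_fd < 0) {
			perror("open /dev/kvm");
			return 1;
		}

		int vm_fd = ioctl(sys_fd, KVM_CREATE_VM, 0);
		if (vm_fd < 0) {
			perror("KVM_CREATE_VM");
			return 1;
		}

		struct kvm_userspace_memory_region region = {
			.slot = 1,
			.flags = KVM_MEM_PCI_HOLE,
			.guest_phys_addr = 0xb0000000,	/* example GPA in MMCONFIG */
			.memory_size = 8192ULL * 4096,	/* the 32 MB from the cover letter */
			.userspace_addr = 0,		/* no backing memory required */
		};

		if (ioctl(vm_fd, KVM_SET_USER_MEMORY_REGION, &region))
			perror("KVM_SET_USER_MEMORY_REGION");

		return 0;
	}

If the semantics end up as described above, guest reads from that range
should complete inside KVM and return 0xff, and only the (rare) writes
should bounce out to userspace.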