[RFC PATCH part-7 00/12] Memory protection based on page state

This patch set is part 7 of the RFC series. It introduces memory
protection based on page state management for pKVM on Intel platforms
and enables running normal VMs on top of it.

Use the ignored bits in an EPT page table entry [1] to record the page
state and owner id of a page:

 63 ... 58 |   57  56   |    ...    |  31 ... 12 | 11 ... 0
 ---------------------------------------------------------
 |  ...    | page state |    ...    | [owner_id] |    ...

Page state - bits[57,56]:
- PKVM_NOPAGE(00b):
	the page has no mapping in the page table.
	In this state, the host EPT uses the ignored PTE
	bits[31~12] to record the owner_id.
- PKVM_PAGE_OWNED(01b):
	the page is owned exclusively by the page-table owner.
- PKVM_PAGE_SHARED_OWNED(10b):
	the page is owned by the page-table owner, but is shared
	with another entity.
- PKVM_PAGE_SHARED_BORROWED(11b):
	the page is shared with, but not owned by the page-table
	owner.

Owner_id - bits[31~12] (only valid in host EPT when PKVM_NOPAGE):
- 0: 	  PKVM_ID_HYP
- 1:      PKVM_ID_HOST
- others: PKVM_ID_GUEST
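
The encoding above can be illustrated with a few bit helpers. This is
only a minimal sketch, not the code from the patches: the macro,
enum and function names below are hypothetical, and only the bit
positions (state in bits[57,56], owner_id in bits[31~12]) and the
state values are taken from the description above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names; bit positions follow the layout described above. */
#define ST_SHIFT        56
#define ST_MASK         (UINT64_C(3) << ST_SHIFT)
#define OWNER_ID_SHIFT  12
#define OWNER_ID_MAX    UINT32_C(0xFFFFF)	/* 20 bits: 31..12 */
#define OWNER_ID_MASK   ((uint64_t)OWNER_ID_MAX << OWNER_ID_SHIFT)

enum sketch_state {
	ST_NOPAGE          = 0,	/* 00b, PKVM_NOPAGE */
	ST_OWNED           = 1,	/* 01b, PKVM_PAGE_OWNED */
	ST_SHARED_OWNED    = 2,	/* 10b, PKVM_PAGE_SHARED_OWNED */
	ST_SHARED_BORROWED = 3,	/* 11b, PKVM_PAGE_SHARED_BORROWED */
};

/* Read the two page state bits out of a PTE value. */
static inline enum sketch_state pte_page_state(uint64_t pte)
{
	return (enum sketch_state)((pte & ST_MASK) >> ST_SHIFT);
}

/* Replace the page state bits, leaving every other bit untouched. */
static inline uint64_t pte_set_page_state(uint64_t pte, enum sketch_state s)
{
	return (pte & ~ST_MASK) | ((uint64_t)s << ST_SHIFT);
}

/* Build an unmapped host entry that records only the owner (state 00b). */
static inline uint64_t pte_make_annotation(uint32_t owner_id)
{
	assert(owner_id <= OWNER_ID_MAX);
	return (uint64_t)owner_id << OWNER_ID_SHIFT;
}

/* Only meaningful for a host entry whose state is ST_NOPAGE. */
static inline uint32_t pte_owner_id(uint64_t pte)
{
	return (uint32_t)((pte & OWNER_ID_MASK) >> OWNER_ID_SHIFT);
}
```

Keeping the owner_id in the low ignored bits and the state in the high
ignored bits means a state update never disturbs the recorded owner.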

The state machine below defines how page states transition between
different entities (the host EPT and guest shadow EPTs, which cover
both normal VMs and protected VMs):

                                            +------------------+
     +------------------+                   |  (Init state)    |
     |  host : NOPAGE   | <---------------- |  host* : OWNED   |
     |  guestA*: OWNED  | ----------------> |  guest: NOPAGE   |
     +------------------+       /           +------------------+
           |        ^          /                 |        ^
           |        |         /                  |        |
           |        |        /                   |        |
           |        |       /                    |        |
           |        |      /                     |        |
           v        |     /                      v        |
  +----------------------------+         +----------------------------+
  |   host : SHARED_BORROWED   |         |   host : SHARED_OWNED      |
  |   guestA: SHARED_OWNED     |         |   guestB*: SHARED_BORROWED |
  +----------------------------+         +----------------------------+

 [*] host:   EPT of host VM
 [*] guestA: shadow EPT of a protected VM
 [*] guestB: shadow EPT of a normal VM
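
One way to read the diagram is as a table of allowed transitions
between its four boxes. The sketch below is an illustration under
assumptions, not code from the patches: all names are hypothetical, and
the diagonal edge is read as the direct return from a guestA-shared
page to the host that the page sharing description mentions.

```c
#include <assert.h>
#include <stdbool.h>

/* One value per box: (host EPT state, guest shadow EPT state). */
enum pair_state {
	PAIR_HOST_OWNED,	/* host: OWNED,           guest:  NOPAGE (init) */
	PAIR_GUEST_OWNED,	/* host: NOPAGE,          guestA: OWNED */
	PAIR_GUEST_SHARED,	/* host: SHARED_BORROWED, guestA: SHARED_OWNED */
	PAIR_HOST_SHARED,	/* host: SHARED_OWNED,    guestB: SHARED_BORROWED */
	NR_PAIR_STATES,
};

/* allowed[from][to] is true for every arrow in the diagram. */
static const bool allowed[NR_PAIR_STATES][NR_PAIR_STATES] = {
	[PAIR_HOST_OWNED][PAIR_GUEST_OWNED]   = true,	/* donate to guestA */
	[PAIR_GUEST_OWNED][PAIR_HOST_OWNED]   = true,	/* undonate */
	[PAIR_GUEST_OWNED][PAIR_GUEST_SHARED] = true,	/* guestA shares with host */
	[PAIR_GUEST_SHARED][PAIR_GUEST_OWNED] = true,	/* guestA unshares */
	[PAIR_GUEST_SHARED][PAIR_HOST_OWNED]  = true,	/* direct return to host */
	[PAIR_HOST_OWNED][PAIR_HOST_SHARED]   = true,	/* host shares with guestB */
	[PAIR_HOST_SHARED][PAIR_HOST_OWNED]   = true,	/* host unshares */
};

/* Callers must pass valid states; anything not listed above is refused. */
bool transition_allowed(enum pair_state from, enum pair_state to)
{
	assert((unsigned)from < NR_PAIR_STATES && (unsigned)to < NR_PAIR_STATES);
	return allowed[from][to];
}
```

With this table, a page shared with a normal VM cannot jump straight to
being owned by a protected VM: it has to go back through the init state
first.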

Initially, all pages except those owned by pKVM are owned by the host
VM, so these pages are marked PKVM_PAGE_OWNED in the host EPT.
Meanwhile, before a guest's first EPT_VIOLATION, no page is mapped in
the guest's shadow EPT, so all page states in its shadow EPT are
PKVM_NOPAGE.

When a guest EPT_VIOLATION happens, pKVM does EPT shadowing to build
the shadow EPT mapping based on the virtual EPT. During shadowing, the
state of the corresponding page shall follow the state machine above to
perform page donation or page sharing.

- page donation

  For a protected VM (guestA), during EPT shadowing, the page assigned
  to guestA shall be donated from the host VM, which means the page's
  ownership moves from the host to guestA. In the host EPT, the mapping
  of the corresponding page table entry (host_gpa to hpa (== host_gpa))
  is removed and its page state is marked PKVM_NOPAGE (with guestA
  recorded as the owner_id). In the guestA shadow EPT, the mapping of
  the corresponding page table entry (gpa to hpa) is set up and its
  page state is marked PKVM_PAGE_OWNED.

  Once a page is donated to a guest, it cannot be donated to or shared
  with other guests until it is undonated back to the host.

  Sometimes the host also needs to donate pages to the pKVM hypervisor
  (e.g., when creating a VM, its shadow VM data structure is allocated
  in the host and then donated to the pKVM hypervisor).

  This patch set covers page donation from the host to the pKVM
  hypervisor, but does not include page donation from the host to a
  protected VM, which is essential for running a protected VM.

- page sharing

  For a normal VM (guestB), during EPT shadowing, the page assigned to
  guestB shall be shared from the host VM, which means both the host VM
  and guestB can access this page. In the host EPT, the mapping of the
  corresponding page table entry is kept and its page state is marked
  PKVM_PAGE_SHARED_OWNED. In the guestB shadow EPT, the mapping of the
  corresponding page table entry is set up and its page state is marked
  PKVM_PAGE_SHARED_BORROWED.

  Once a page is shared with a guest, it cannot be donated to or shared
  with other guests until it is unshared back to the host.

  For a protected VM (guestA), a page can be shared back to the host VM
  after it has been donated to this guest (e.g., to support virtio). In
  this case, in the host EPT, the mapping of the corresponding page
  table entry is set up again and its page state is marked
  PKVM_PAGE_SHARED_BORROWED. In the guestA shadow EPT, the mapping of
  the corresponding page table entry is kept but its page state is
  changed to PKVM_PAGE_SHARED_OWNED.

  Once a page has been shared back to the host after donation, guestA
  is allowed to unshare it. The page can also be returned to the host
  directly.

  [Note: the page sharing from a protected VM described above is not
  covered in this RFC]
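
The host-side donation and sharing rules above can be sketched on a toy
model. This is an illustration under assumptions rather than the
do_donate()/do_share() helpers from the patches: every name below is
hypothetical, each "EPT" is a small array indexed by page frame, and
the gpa to hpa translation and locking are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum toy_state { ST_NOPAGE, ST_OWNED, ST_SHARED_OWNED, ST_SHARED_BORROWED };

#define NR_PAGES 8u	/* toy address space size, chosen arbitrarily */

/* One toy entry per page frame; a real EPT keeps this inside the PTE. */
struct entry {
	bool mapped;
	enum toy_state state;
	uint32_t owner_id;	/* meaningful only in the host table when NOPAGE */
};

struct toy_ept {
	struct entry e[NR_PAGES];
};

/* Initial state: the host owns every page, the guest maps nothing. */
void toy_init(struct toy_ept *host, struct toy_ept *guest)
{
	assert(host && guest);
	for (unsigned i = 0; i < NR_PAGES; i++) {
		host->e[i]  = (struct entry){ true, ST_OWNED, 0 };
		guest->e[i] = (struct entry){ false, ST_NOPAGE, 0 };
	}
}

/* Donation: ownership moves from host to guest. Returns 0 or -1. */
int toy_donate(struct toy_ept *host, struct toy_ept *guest,
	       unsigned pfn, uint32_t guest_id)
{
	if (pfn >= NR_PAGES || host->e[pfn].state != ST_OWNED)
		return -1;	/* already donated or shared */
	host->e[pfn]  = (struct entry){ false, ST_NOPAGE, guest_id };
	guest->e[pfn] = (struct entry){ true, ST_OWNED, 0 };
	return 0;
}

/* Sharing: the host keeps its mapping, the guest borrows the page. */
int toy_share(struct toy_ept *host, struct toy_ept *guest, unsigned pfn)
{
	if (pfn >= NR_PAGES || host->e[pfn].state != ST_OWNED)
		return -1;	/* already donated or shared */
	host->e[pfn].state = ST_SHARED_OWNED;
	guest->e[pfn] = (struct entry){ true, ST_SHARED_BORROWED, 0 };
	return 0;
}

/* Unsharing: the guest mapping goes away, the host owns exclusively again. */
int toy_unshare(struct toy_ept *host, struct toy_ept *guest, unsigned pfn)
{
	if (pfn >= NR_PAGES || host->e[pfn].state != ST_SHARED_OWNED ||
	    guest->e[pfn].state != ST_SHARED_BORROWED)
		return -1;
	guest->e[pfn] = (struct entry){ false, ST_NOPAGE, 0 };
	host->e[pfn].state = ST_OWNED;
	return 0;
}
```

Because both toy_donate() and toy_share() require the host entry to be
in the exclusively owned state, a page that is already donated or
shared is refused until it has been returned to the host.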

Based on the above, this patch set supports the following page state
APIs:

- __pkvm_host_donate_hyp/__pkvm_hyp_donate_host
  donate/undonate shadow VM/VCPU structures from the host to pKVM
- __pkvm_host_share_guest/__pkvm_host_unshare_guest
  manage the page state of a normal VM's memory, which in the future
  will prevent protected VMs from being assigned pages that are in
  such a shared page state.

[1]: SDM: The Extended Page Table Mechanism (EPT) chapter

Jason Chen CJ (2):
  pkvm: x86: Add pgtable override helper functions for map/unmap/free
    leaf
  pkvm: x86: Use page state API in shadow EPT for normal VM

Shaoqin Huang (10):
  pkvm: x86: Introduce pkvm_pgtable_annotate
  pkvm: x86: Use host EPT to track page ownership
  pkvm: x86: Use SW bits to track page state
  pkvm: x86: Add the record of the page state into page table entry
  pkvm: x86: Expose host EPT lock
  pkvm: x86: Implement do_donate() helper for donating memory
  pkvm: x86: Implement __pkvm_hyp_donate_host()
  pkvm: x86: Donate shadow vm & vcpu pages to hypervisor
  pkvm: x86: Implement do_share() helper for sharing memory
  pkvm: x86: Implement do_unshare() helper for unsharing memory

 arch/x86/kvm/Kconfig                      |   1 +
 arch/x86/kvm/vmx/pkvm/hyp/Makefile        |   2 +-
 arch/x86/kvm/vmx/pkvm/hyp/ept.c           |  84 ++-
 arch/x86/kvm/vmx/pkvm/hyp/ept.h           |   3 +
 arch/x86/kvm/vmx/pkvm/hyp/init_finalise.c |   6 +-
 arch/x86/kvm/vmx/pkvm/hyp/mem_protect.c   | 593 ++++++++++++++++++++++
 arch/x86/kvm/vmx/pkvm/hyp/mem_protect.h   | 118 +++++
 arch/x86/kvm/vmx/pkvm/hyp/mmu.c           |   5 +-
 arch/x86/kvm/vmx/pkvm/hyp/pgtable.c       | 222 +++++---
 arch/x86/kvm/vmx/pkvm/hyp/pgtable.h       |  49 +-
 arch/x86/kvm/vmx/pkvm/hyp/pkvm.c          |  61 ++-
 arch/x86/kvm/vmx/pkvm/hyp/pkvm_hyp.h      |   6 +
 12 files changed, 1061 insertions(+), 89 deletions(-)
 create mode 100644 arch/x86/kvm/vmx/pkvm/hyp/mem_protect.c
 create mode 100644 arch/x86/kvm/vmx/pkvm/hyp/mem_protect.h

-- 
2.25.1



