Re: [PATCH v10 08/16] KVM: X86: Introduce KVM_HC_PAGE_ENC_STATUS hypercall

On Wed, Feb 3, 2021 at 4:38 PM Ashish Kalra <Ashish.Kalra@xxxxxxx> wrote:
>
> From: Brijesh Singh <brijesh.singh@xxxxxxx>
>
> This hypercall is used by the SEV guest to notify a change in the page
> encryption status to the hypervisor. The hypercall should be invoked
> only when the encryption attribute is changed from encrypted -> decrypted
> and vice versa. By default all guest pages are considered encrypted.
>
> The patch introduces a new shared pages list implemented as a
> sorted linked list to track the shared/unencrypted regions marked by the
> guest hypercall.
>
> Cc: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
> Cc: Ingo Molnar <mingo@xxxxxxxxxx>
> Cc: "H. Peter Anvin" <hpa@xxxxxxxxx>
> Cc: Paolo Bonzini <pbonzini@xxxxxxxxxx>
> Cc: "Radim Krčmář" <rkrcmar@xxxxxxxxxx>
> Cc: Joerg Roedel <joro@xxxxxxxxxx>
> Cc: Borislav Petkov <bp@xxxxxxx>
> Cc: Tom Lendacky <thomas.lendacky@xxxxxxx>
> Cc: x86@xxxxxxxxxx
> Cc: kvm@xxxxxxxxxxxxxxx
> Cc: linux-kernel@xxxxxxxxxxxxxxx
> Signed-off-by: Brijesh Singh <brijesh.singh@xxxxxxx>
> Co-developed-by: Ashish Kalra <ashish.kalra@xxxxxxx>
> Signed-off-by: Ashish Kalra <ashish.kalra@xxxxxxx>
> ---
>  Documentation/virt/kvm/hypercalls.rst |  15 +++
>  arch/x86/include/asm/kvm_host.h       |   2 +
>  arch/x86/kvm/svm/sev.c                | 150 ++++++++++++++++++++++++++
>  arch/x86/kvm/svm/svm.c                |   2 +
>  arch/x86/kvm/svm/svm.h                |   5 +
>  arch/x86/kvm/vmx/vmx.c                |   1 +
>  arch/x86/kvm/x86.c                    |   6 ++
>  include/uapi/linux/kvm_para.h         |   1 +
>  8 files changed, 182 insertions(+)
>
> diff --git a/Documentation/virt/kvm/hypercalls.rst b/Documentation/virt/kvm/hypercalls.rst
> index ed4fddd364ea..7aff0cebab7c 100644
> --- a/Documentation/virt/kvm/hypercalls.rst
> +++ b/Documentation/virt/kvm/hypercalls.rst
> @@ -169,3 +169,18 @@ a0: destination APIC ID
>
>  :Usage example: When sending a call-function IPI-many to vCPUs, yield if
>                 any of the IPI target vCPUs was preempted.
> +
> +
> +8. KVM_HC_PAGE_ENC_STATUS
> +-------------------------
> +:Architecture: x86
> +:Status: active
> +:Purpose: Notify the encryption status changes in guest page table (SEV guest)
> +
> +a0: the guest physical address of the start page
> +a1: the number of pages
> +a2: encryption attribute
> +
> +   Where:
> +       * 1: Encryption attribute is set
> +       * 0: Encryption attribute is cleared
> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> index 3d6616f6f6ef..2da5f5e2a10e 100644
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -1301,6 +1301,8 @@ struct kvm_x86_ops {
>         int (*complete_emulated_msr)(struct kvm_vcpu *vcpu, int err);
>
>         void (*vcpu_deliver_sipi_vector)(struct kvm_vcpu *vcpu, u8 vector);
> +       int (*page_enc_status_hc)(struct kvm *kvm, unsigned long gpa,
> +                                 unsigned long sz, unsigned long mode);
>  };
>
>  struct kvm_x86_nested_ops {
> diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
> index 25eaf35ba51d..55c628df5155 100644
> --- a/arch/x86/kvm/svm/sev.c
> +++ b/arch/x86/kvm/svm/sev.c
> @@ -45,6 +45,11 @@ struct enc_region {
>         unsigned long size;
>  };
>
> +struct shared_region {
> +       struct list_head list;
> +       unsigned long gfn_start, gfn_end;
> +};
> +
>  static int sev_flush_asids(void)
>  {
>         int ret, error = 0;
> @@ -196,6 +201,8 @@ static int sev_guest_init(struct kvm *kvm, struct kvm_sev_cmd *argp)
>         sev->active = true;
>         sev->asid = asid;
>         INIT_LIST_HEAD(&sev->regions_list);
> +       INIT_LIST_HEAD(&sev->shared_pages_list);
> +       sev->shared_pages_list_count = 0;
>
>         return 0;
>
> @@ -1473,6 +1480,148 @@ static int sev_receive_finish(struct kvm *kvm, struct kvm_sev_cmd *argp)
>         return ret;
>  }
>
> +static int remove_shared_region(unsigned long start, unsigned long end,
> +                               struct list_head *head)
> +{
> +       struct shared_region *pos;
> +
> +       list_for_each_entry(pos, head, list) {
> +               if (pos->gfn_start == start &&
> +                   pos->gfn_end == end) {
> +                       list_del(&pos->list);
> +                       kfree(pos);
> +                       return -1;
> +               } else if (start >= pos->gfn_start && end <= pos->gfn_end) {
> +                       if (start == pos->gfn_start)
> +                               pos->gfn_start = end + 1;
> +                       else if (end == pos->gfn_end)
> +                               pos->gfn_end = start - 1;
> +                       else {
> +                               /* Do a de-merge -- split linked list nodes */
> +                               unsigned long tmp;
> +                               struct shared_region *shrd_region;
> +
> +                               tmp = pos->gfn_end;
> +                               pos->gfn_end = start-1;
> +                               shrd_region = kzalloc(sizeof(*shrd_region), GFP_KERNEL_ACCOUNT);
> +                               if (!shrd_region)
> +                                       return -ENOMEM;
> +                               shrd_region->gfn_start = end + 1;
> +                               shrd_region->gfn_end = tmp;
> +                               list_add(&shrd_region->list, &pos->list);
> +                               return 1;
> +                       }
> +                       return 0;
> +               }
> +       }

This doesn't handle the case where the region being marked as
encrypted is larger than the unencrypted region under
consideration, which (I believe) can happen with the current kexec
handling (since it is oblivious to the prior state).

I would probably break this down into the "five cases": no
intersection (skip), entry is completely contained in the removed
region (drop), entry completely contains the removed region (split),
removed region intersects the entry's start (chop), and removed
region intersects the entry's end (chop).
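
A rough userspace sketch of those five cases follows. It is not kernel code and not the patch's implementation: `struct region`, `region_new()` and `region_remove()` are hypothetical names, a plain singly linked list stands in for `list_head`, and `malloc`/`free` stand in for `kzalloc`/`kfree`. It assumes the list is sorted, non-overlapping, and uses inclusive bounds like the patch's `gfn_start`/`gfn_end`.

```c
#include <stdlib.h>

/* Hypothetical userspace stand-in for the patch's struct shared_region. */
struct region {
	unsigned long gfn_start, gfn_end;	/* inclusive bounds */
	struct region *next;
};

/* Helper for building lists; returns NULL on allocation failure. */
static struct region *region_new(unsigned long start, unsigned long end,
				 struct region *next)
{
	struct region *r = malloc(sizeof(*r));

	if (r) {
		r->gfn_start = start;
		r->gfn_end = end;
		r->next = next;
	}
	return r;
}

/*
 * Remove [start, end] from a sorted, non-overlapping list.
 * Returns 0 on success, -1 if the split case fails to allocate.
 */
static int region_remove(struct region **head, unsigned long start,
			 unsigned long end)
{
	struct region **link = head;

	while (*link) {
		struct region *pos = *link;

		if (pos->gfn_end < start) {
			/* No intersection, entry is below the range: skip. */
			link = &pos->next;
		} else if (pos->gfn_start > end) {
			/* No intersection, and the list is sorted: done. */
			break;
		} else if (start <= pos->gfn_start && pos->gfn_end <= end) {
			/* Entry completely contained in the range: drop. */
			*link = pos->next;
			free(pos);
		} else if (pos->gfn_start < start && end < pos->gfn_end) {
			/* Entry completely contains the range: split. */
			struct region *tail = region_new(end + 1, pos->gfn_end,
							 pos->next);

			if (!tail)
				return -1;
			pos->gfn_end = start - 1;
			pos->next = tail;
			break;
		} else if (pos->gfn_start < start) {
			/* Range covers the entry's end: chop the end. */
			pos->gfn_end = start - 1;
			link = &pos->next;
		} else {
			/* Range covers the entry's start: chop the start. */
			pos->gfn_start = end + 1;
			break;
		}
	}
	return 0;
}
```

Because the walk continues after a drop or an end chop, a single call can span several entries, which is the case the quoted loop misses.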

>
> +       return 0;
> +}
> +
> +static int add_shared_region(unsigned long start, unsigned long end,
> +                            struct list_head *shared_pages_list)
> +{
> +       struct list_head *head = shared_pages_list;
> +       struct shared_region *shrd_region;
> +       struct shared_region *pos;
> +
> +       if (list_empty(head)) {
> +               shrd_region = kzalloc(sizeof(*shrd_region), GFP_KERNEL_ACCOUNT);
> +               if (!shrd_region)
> +                       return -ENOMEM;
> +               shrd_region->gfn_start = start;
> +               shrd_region->gfn_end = end;
> +               list_add_tail(&shrd_region->list, head);
> +               return 1;
> +       }
> +
> +       /*
> +        * Shared pages list is a sorted list in ascending order of
> +        * guest PA's and also merges consecutive range of guest PA's
> +        */
> +       list_for_each_entry(pos, head, list) {
> +               if (pos->gfn_end < start)
> +                       continue;
> +               /* merge consecutive guest PA(s) */
> +               if (pos->gfn_start <= start && pos->gfn_end >= start) {
> +                       pos->gfn_end = end;

I'm not sure this correctly avoids having duplicate overlapping
elements in the list. It also doesn't merge consecutive contiguous
regions. The current guest implementation should never call the
hypercall with C=0 for the same region twice without calling it with
C=1 in between, but this API should still handle that pattern.

The easiest pattern would probably be to:
1) find (or insert) the node that will contain the added region.
2) remove the contents of the added region from the tail (will
typically do nothing).
3) merge the head of the tail into the current node, if the end of the
current node matches the start of that head.
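
One way those steps might look, as a userspace sketch rather than kernel code. `struct region` and `region_add()` are hypothetical names, a singly linked list replaces `list_head`, and `malloc`/`free` replace `kzalloc`/`kfree`. It assumes inclusive bounds and a sorted list. As a slight variation, steps 2 and 3 are folded into one loop that absorbs every following node that overlaps or directly abuts the current one.

```c
#include <limits.h>
#include <stdlib.h>

/* Hypothetical userspace stand-in for the patch's struct shared_region. */
struct region {
	unsigned long gfn_start, gfn_end;	/* inclusive bounds */
	struct region *next;
};

/*
 * Add [start, end] to a sorted list, keeping it free of overlapping
 * or adjacent entries. Returns 0 on success, -1 on allocation failure.
 */
static int region_add(struct region **head, unsigned long start,
		      unsigned long end)
{
	struct region **link = head;
	struct region *cur;

	/* Step 1a: skip nodes that end before start and do not abut it. */
	while (*link && start > 0 && (*link)->gfn_end < start - 1)
		link = &(*link)->next;

	cur = *link;
	if (cur && cur->gfn_start <= start) {
		/* Step 1b: an existing node can hold the range: extend it. */
		if (cur->gfn_end < end)
			cur->gfn_end = end;
	} else {
		/* Step 1c: otherwise insert a new node in sorted position. */
		struct region *n = malloc(sizeof(*n));

		if (!n)
			return -1;
		n->gfn_start = start;
		n->gfn_end = end;
		n->next = cur;
		*link = n;
		cur = n;
	}

	/* Steps 2 and 3: absorb following nodes that overlap or abut cur. */
	while (cur->next && (cur->gfn_end == ULONG_MAX ||
			     cur->next->gfn_start <= cur->gfn_end + 1)) {
		struct region *nxt = cur->next;

		if (nxt->gfn_end > cur->gfn_end)
			cur->gfn_end = nxt->gfn_end;
		cur->next = nxt->next;
		free(nxt);
	}
	return 0;
}
```

With this shape, repeating a C=0 call for an already shared range leaves the list unchanged, and a range that bridges two entries collapses them into one.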
>
> +                       return 0;
> +               }
> +               break;
> +       }
>
> +       /*
> +        * Add a new node, allocate nodes using GFP_KERNEL_ACCOUNT so that
> +        * kernel memory can be tracked/throttled in case a
> +        * malicious guest makes infinite number of hypercalls to
> +        * exhaust host kernel memory and cause a DOS attack.
> +        */
> +       shrd_region = kzalloc(sizeof(*shrd_region), GFP_KERNEL_ACCOUNT);
> +       if (!shrd_region)
> +               return -ENOMEM;
> +       shrd_region->gfn_start = start;
> +       shrd_region->gfn_end = end;
> +       list_add_tail(&shrd_region->list, &pos->list);
> +       return 1;
>
> +}
> +

Thanks!
Steve



