Re: [PATCH v2 1/9] KVM: x86: Add AMD SEV specific Hypercall3

"Kalra, Ashish" <Ashish.Kalra@xxxxxxx> · Tue, 8 Dec 2020 04:16:16 +0000

I don’t think that the bitmap by itself is really a performance bottleneck here.

Thanks,
Ashish

> On Dec 7, 2020, at 9:10 PM, Steve Rutherford <srutherford@xxxxxxxxxx> wrote:
> 
> On Mon, Dec 7, 2020 at 12:42 PM Sean Christopherson <seanjc@xxxxxxxxxx> wrote:
>> 
>>> On Sun, Dec 06, 2020, Paolo Bonzini wrote:
>>> On 03/12/20 01:34, Sean Christopherson wrote:
>>>> On Tue, Dec 01, 2020, Ashish Kalra wrote:
>>>>> From: Brijesh Singh <brijesh.singh@xxxxxxx>
>>>>> 
>>>>> KVM hypercall framework relies on alternative framework to patch the
>>>>> VMCALL -> VMMCALL on AMD platform. If a hypercall is made before
>>>>> apply_alternative() is called then it defaults to VMCALL. The approach
>>>>> works fine on non-SEV guests. A VMCALL would cause #UD, and the hypervisor
>>>>> will be able to decode the instruction and do the right things. But
>>>>> when SEV is active, guest memory is encrypted with guest key and
>>>>> hypervisor will not be able to decode the instruction bytes.
>>>>> 
>>>>> Add an SEV-specific hypercall3 that unconditionally uses VMMCALL. The hypercall
>>>>> will be used by the SEV guest to notify the hypervisor of encrypted pages.
>>>> 
>>>> What if we invert KVM_HYPERCALL and X86_FEATURE_VMMCALL to default to VMMCALL
>>>> and opt into VMCALL?  It's a synthetic feature flag either way, and I don't
>>>> think there are any existing KVM hypercalls that happen before alternatives are
>>>> patched, i.e. it'll be a nop for sane kernel builds.
>>>> 
>>>> I'm also skeptical that a KVM specific hypercall is the right approach for the
>>>> encryption behavior, but I'll take that up in the patches later in the series.
>>> 
>>> Do you think that it's the guest that should "donate" memory for the bitmap
>>> instead?
>> 
>> No.  Two things I'd like to explore:
>> 
>>  1. Making the hypercall to announce/request private vs. shared common across
>>     hypervisors (KVM, Hyper-V, VMware, etc...) and technologies (SEV-* and TDX).
>>     I'm concerned that we'll end up with multiple hypercalls that do more or
>>     less the same thing, e.g. KVM+SEV, Hyper-V+SEV, TDX, etc...  Maybe it's a
>>     pipe dream, but I'd like to at least explore options before shoving in KVM-
>>     only hypercalls.
>> 
>>  2. Tracking shared memory via a list of ranges instead of using a bitmap to
>>     track all of guest memory.  For most use cases, the vast majority of guest
>>     memory will be private, most ranges will be 2MB+, and conversions between
>>     private and shared will be uncommon events, i.e. the overhead to walk and
>>     split/merge list entries is hopefully not a big concern.  I suspect a list
>>     would consume far less memory, hopefully without impacting performance.
> 
> For a fancier data structure, I'd suggest an interval tree. Linux
> already has an rbtree-based interval tree implementation, which would
> likely work, and would probably assuage any performance concerns.
> 
> Something like this would not be worth doing unless most of the shared
> pages were physically contiguous. A sample Ubuntu 20.04 VM on GCP had
> 60ish discontiguous shared regions. This is by no means a thorough
> search, but it's suggestive. If this is typical, then the bitmap would
> be far less efficient than most any interval-based data structure.
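
As a rough back-of-the-envelope check on the memory side, here is a small sketch in C. The 4 KiB page size, the 16-byte flat range entry, and the helper names are assumptions made for illustration; none of them come from the patch series, and an rbtree-based interval tree node would cost more per entry than the flat entry used here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 4 KiB guest pages, one bitmap bit per page. */
#define SKETCH_PAGE_SHIFT 12

/* Hypothetical flat range entry covering gfns [start, end). */
struct gfn_range {
	uint64_t start;
	uint64_t end;
};

static_assert(sizeof(struct gfn_range) == 16, "two u64 fields per range");

/* Bytes needed for a bitmap that covers guest_bytes of guest memory. */
static uint64_t bitmap_bytes(uint64_t guest_bytes)
{
	uint64_t pages = guest_bytes >> SKETCH_PAGE_SHIFT;

	return (pages + 7) / 8;
}

/* Bytes needed for a flat array of nr_ranges range entries. */
static uint64_t range_list_bytes(size_t nr_ranges)
{
	return (uint64_t)nr_ranges * sizeof(struct gfn_range);
}
```

Under these assumptions, a 64 GiB guest (a made-up size, not the size of the GCP VM above) needs a 2 MiB bitmap, while 60 ranges fit in under 1 KiB. The bitmap cost scales with guest memory size, whereas the range cost scales with how fragmented the shared regions are.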
> 
> You'd have to allow userspace to upper bound the number of intervals
> (similar to the maximum bitmap size), to prevent host OOMs due to
> malicious guests. There's something nice about the guest donating
> memory for this, since that would eliminate the OOM risk.
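
To make the range-list idea and the upper bound concrete, here is a minimal userspace sketch in C of a sorted list of shared gfn ranges with a cap on the number of entries. Everything in it (struct names, function names, the RANGE_CAP ceiling, the flat array) is hypothetical and only illustrates the approach discussed above; it is not the interface proposed in the series, and a real implementation could use the kernel's interval tree instead.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define RANGE_CAP 64	/* hypothetical compile-time ceiling */

struct shared_range {
	uint64_t start;	/* first shared gfn */
	uint64_t end;	/* one past the last shared gfn */
};

struct shared_list {
	size_t nr;	/* entries in use, kept sorted and disjoint */
	size_t max;	/* limit chosen by userspace, <= RANGE_CAP */
	struct shared_range r[RANGE_CAP];
};

static void shared_list_init(struct shared_list *l, size_t max)
{
	l->nr = 0;
	l->max = max < RANGE_CAP ? max : RANGE_CAP;
}

/* Mark [start, end) as shared, merging with neighbours where possible. */
static int shared_list_add(struct shared_list *l, uint64_t start, uint64_t end)
{
	size_t i = 0, j;

	assert(l->max <= RANGE_CAP);
	if (start >= end)
		return -EINVAL;

	/* Skip entries that end strictly before the new range begins. */
	while (i < l->nr && l->r[i].end < start)
		i++;

	/* Absorb every entry that overlaps or touches [start, end). */
	for (j = i; j < l->nr && l->r[j].start <= end; j++) {
		if (l->r[j].start < start)
			start = l->r[j].start;
		if (l->r[j].end > end)
			end = l->r[j].end;
	}

	if (j == i) {
		/* Nothing to merge with, so a new entry is needed. */
		if (l->nr >= l->max)
			return -ENOSPC;	/* bound hit: refuse to grow */
		memmove(&l->r[i + 1], &l->r[i], (l->nr - i) * sizeof(l->r[0]));
		l->nr++;
	} else {
		/* Collapse entries i..j-1 into the single slot i. */
		memmove(&l->r[i + 1], &l->r[j], (l->nr - j) * sizeof(l->r[0]));
		l->nr -= j - i - 1;
	}
	l->r[i].start = start;
	l->r[i].end = end;
	return 0;
}

/* Linear lookup; a binary search would also work on the sorted array. */
static bool shared_list_contains(const struct shared_list *l, uint64_t gfn)
{
	for (size_t i = 0; i < l->nr; i++)
		if (gfn >= l->r[i].start && gfn < l->r[i].end)
			return true;
	return false;
}
```

With the limit enforced in the add path, a guest that fragments its shared regions gets -ENOSPC once the limit is reached rather than growing host memory without bound. Converting a range back to private would need a split path as well, which this sketch leaves out.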