Re: [intel-sgx-kernel-dev] [PATCH 08/10] kvm: vmx: add guest's IA32_SGXLEPUBKEYHASHn runtime switch support

Andy Lutomirski <luto@xxxxxxxxxx> · Tue, 6 Jun 2017 14:22:49 -0700

On Tue, Jun 6, 2017 at 1:52 PM, Huang, Kai <kai.huang@xxxxxxxxxxxxxxx> wrote:
>
>
> On 5/18/2017 7:45 PM, Huang, Kai wrote:
>>
>>
>>
>> On 5/17/2017 12:09 PM, Andy Lutomirski wrote:
>>>
>>> On Mon, May 15, 2017 at 5:48 PM, Huang, Kai <kai.huang@xxxxxxxxxxxxxxx>
>>> wrote:
>>>>
>>>>
>>>>
>>>> On 5/12/2017 6:11 PM, Andy Lutomirski wrote:
>>>>>
>>>>>
>>>>> On Thu, May 11, 2017 at 9:56 PM, Huang, Kai <kai.huang@xxxxxxxxxxxxxxx>
>>>>> wrote:
>>>>>>
>>>>>>
>>>>>> I am not sure whether the cost of writing to 4 MSRs would be
>>>>>> *extremely*
>>>>>> slow, as when vcpu is schedule in, KVM is already doing vmcs_load,
>>>>>> writing
>>>>>> to several MSRs, etc.
>>>>>
>>>>>
>>>>>
>>>>> I'm speculating that these MSRs may be rather unoptimized and hence
>>>>> unusualy slow.
>>>>>
>>>>>>
>>>>>>>
>>>>>>> Have a percpu variable that stores the current SGXLEPUBKEYHASH along
>>>>>>> with whatever lock is needed (probably just a mutex).  Users of EINIT
>>>>>>> will take the mutex, compare the percpu variable to the desired
>>>>>>> value,
>>>>>>> and, if it's different, do WRMSR and update the percpu variable.
>>>>>>>
>>>>>>> KVM will implement writes to SGXLEPUBKEYHASH by updating its
>>>>>>> in-memory
>>>>>>> state but *not* changing the MSRs.  KVM will trap and emulate EINIT
>>>>>>> to
>>>>>>> support the same handling as the host.  There is no action required
>>>>>>> at
>>>>>>> all on KVM guest entry and exit.
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> This is doable, but SGX driver needs to do those things and expose
>>>>>> interfaces for KVM to use. In terms of the percpu data, it is nice to
>>>>>> have,
>>>>>> but I am not sure whether it is mandatory, as IMO EINIT is not even in
>>>>>> performance critical path. We can simply read old value from MSRs out
>>>>>> and
>>>>>> compare whether the old equals to the new.
>>>>>
>>>>>
>>>>>
>>>>> I think the SGX driver should probably live in arch/x86, and the
>>>>> interface could be a simple percpu variable that is exported (from the
>>>>> main kernel image, not from a module).
>>>>>
>>>>>>
>>>>>>>
>>>>>>> FWIW, I think that KVM will, in the long run, want to trap EINIT for
>>>>>>> other reasons: someone is going to want to implement policy for what
>>>>>>> enclaves are allowed that applies to guests as well as the host.
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> I am not very convinced why "what enclaves are allowed" in host would
>>>>>> apply
>>>>>> to guest. Can you elaborate? I mean in general virtualization just
>>>>>> focus
>>>>>> emulating hardware behavior. If a native machine is able to run any
>>>>>> LE,
>>>>>> the
>>>>>> virtual machine should be able to as well (of course, with guest's
>>>>>> IA32_FEATURE_CONTROL[bit 17] set).
>>>>>
>>>>>
>>>>>
>>>>> I strongly disagree.  I can imagine two classes of sensible policies
>>>>> for launch control:
>>>>>
>>>>> 1. Allow everything.  This seems quite sensible to me.
>>>>>
>>>>> 2. Allow some things, and make sure that VMs have at least as
>>>>> restrictive a policy as host root has.  After all, what's the point of
>>>>> restricting enclaves in the host if host code can simply spawn a
>>>>> little VM to run otherwise-disallowed enclaves?
>>>>
>>>>
>>>>
>>>> What's the current SGX driver launch control policy? Yes allow
>>>> everything
>>>> works for KVM so lets skip this. Are we going to support allowing
>>>> several
>>>> LEs, or just allowing one single LE? I know Jarkko is doing in-kernel LE
>>>> staff but I don't know details.
>>>>
>>>> I am trying to find a way that we can both not break host launch control
>>>> policy, and be consistent to HW behavior (from guest's view). Currently
>>>> we
>>>> can create a KVM guest with runtime change to IA32_SGXLEPUBKEYHASHn
>>>> either
>>>> enabled or disabled. I introduced an Qemu parameter 'lewr' for this
>>>> purpose.
>>>> Actually I introduced below Qemu SGX parameters for creating guest:
>>>>
>>>>         -sgx epc=<size>,lehash='SHA-256 hash',lewr
>>>>
>>>> where 'epc' specifies guest's EPC size, lehash specifies (initial) value
>>>> of
>>>> guest's IA32_SGXLEPUBKEYHASHn, and 'lewr' specifies whether guest is
>>>> allowed
>>>> to change guest's IA32_SGXLEPUBKEYHASHn at runtime.
>>>>
>>>> If host only allows one single LE to run, KVM can add a restrict that
>>>> only
>>>> allows to create KVM guest with runtime change to IA32_SGXLEPUBKEYHASHn
>>>> disabled, so that only host allowed (single) hash can be used by guest.
>>>> From
>>>> guest's view, it simply has IA32_FEATURE_CONTROL[bit17] cleared and has
>>>> IA32_SGXLEPUBKEYHASHn with default value to be host allowed (single)
>>>> hash.
>>>>
>>>> If host allows several LEs (not but everything), and if we create guest
>>>> with
>>>> 'lewr', then the behavior is not consistent with HW behavior, as from
>>>> guest's hardware's point of view, we can actually run any LE but we have
>>>> to
>>>> tell guest that you are only allowed to change IA32_SGXLEPUBKEYHASHn to
>>>> some
>>>> specific values. One compromise solution is we don't allow to create
>>>> guest
>>>> with 'lewr' specified, and at the meantime, only allow to create guest
>>>> with
>>>> host approved hashes specified in 'lehash'. This will make guest's
>>>> behavior
>>>> consistent to HW behavior but only allows guest to run one LE (which is
>>>> specified by 'lehash' when guest is created).
>>>
>>>
>>> I'm not sure I entirely agree for a couple reasons.
>>>
>>> 1. I wouldn't be surprised if the kernel ends up implementing a policy
>>> in which it checks all enclaves (not just LEs) for acceptability.  In
>>> fact, if the kernel sticks with the "no LE at all or just
>>> kernel-internal LE", then checking enclaves directly against some
>>> admin- or distro-provided signer list seems reasonable.  This type of
>>> policy can't be forwarded to a guest by restricting allowed LE
>>> signers.  But this is mostly speculation since AFAIK no one has
>>> seriously proposed any particular policy support and the plan was to
>>> not have this for the initial implementation.
>>>
>>> 2. While matching hardware behavior is nice in principle, there
>>> doesn't seem to be useful hardware behavior to match here.  If the
>>> host had a list of five allowed LE signers, how exactly would it
>>> restrict the MSRs?  They're not written atomically, so you can't
>>> directly tell what's being written.
>>
>>
>> In this case I actually plan to just allow creating guest with guest's
>> IA32_SGXLEPUBKEYHASHn disabled (without 'lewr' specified). If 'lewr' is
>> specified, creating guest will fail. And we only allow creating guest with
>> host allowed hash values (with 'lehash=hash-value'), and if 'hash-value'
>> specified by 'lehash' is not allowed by host, we also fail to create guest.
>>
>> We can only allow creating guest with 'lewr' specified when host allows
>> anything.
>>
>> But in this way, we are restricting guest OS's ability to run LE, as only
>> one LE, that is specified by 'lehash' parameter, can be run. But I think
>> this won't hurt much, as multiple guests still are able to run different
>> LEs?
>>
>> Also, the only way to fail an MSR
>>>
>>> write is to send #GP, and Windows (and even Linux) may not expect
>>> that.  Linux doesn't panic due to #GP on MSR writes these days, but
>>> you still get a big fat warning.  I wouldn't be at all surprised if
>>> Windows BSODs.
>>
>>
>> We cannot allow writing some particular value to MSRs successfully, while
>> injecting #GP when writing other values to the same MSRs. So #GP is not
>> option.
>>
>> ENCLS[EINIT], on the other hand, returns an actual
>>>
>>> error code.  I'm not sure that a sensible error code exists
>>> ("SGX_HYPERVISOR_SAID_NO?", perhaps),
>>
>>
>> Looks no such error code exists. And we cannot return such error code to
>> guest as such error code is only supposed to be valid when ENCLS is run in
>> hypervisor.
>>
>> but SGX_INVALID_EINITTOKEN seems
>>>
>>> to mean, more or less, "the CPU thinks you're not authorized to do
>>> this", so forcing that error code could be entirely reasonable.
>>>
>>> If the host policy is to allow a list of LE signers, you could return
>>> SGX_INVALID_EINITTOKEN if the guest tries to EINIT an LE that isn't in
>>> the list.
>>
>>
>> But this would be inconsistent with HW behavior. If the hash value in
>> guest's IA32_SGXLEPUBKEYHASHn is matched with the one passed by EINIT, EINIT
>> is not supposed to return SGX_INVALID_EINITTOKEN.
>>
>> I think from VMM's perspective, emulating HW behavior to be consistent
>> with real HW behavior is very important.
>>
>> Paolo, would you provide your comments?
>
>
> Hi all,
>
> This has been quite for a while and I'd like to start discussion again.
> Jarkko told me that currently he only supports one LE in SGX driver, but I
> am not sure whether he is going to extend in the future or not. I think this
> might also depend on requirements from customers.
>
> Andy,
>
> If we only support one LE in driver, then we can only support the same LE
> for all KVM guests, according to your comments that host kernel launch
> control policy should also apply to KVM guests? WOuld you comments more?

I'm not at all convinced that, going forward, Linux's host-side launch
control policy will be entirely contained in the LE.  I'm also not
convinced that non-Linux guests will function at all under this type
of policy -- what if FreeBSD's LE is different for whatever reason?

>
> Jarkko,
>
> Could you help to clarify the whole launch control policy in host side so
> that we can have a better understanding together?
>
> Thanks,
> -Kai
>
>>
>>>
>>> --Andy
>>>
>