On 6/6/2017 6:52 PM, Huang, Kai wrote: >On 6/7/2017 9:22 AM, Andy Lutomirski wrote: >> On Tue, Jun 6, 2017 at 1:52 PM, Huang, Kai <kai.huang@xxxxxxxxxxxxxxx> wrote: >>> >>> >>> On 5/18/2017 7:45 PM, Huang, Kai wrote: >>>> >>>> >>>> >>>> On 5/17/2017 12:09 PM, Andy Lutomirski wrote: >>>>> >>>>> On Mon, May 15, 2017 at 5:48 PM, Huang, Kai >>>>> <kai.huang@xxxxxxxxxxxxxxx> >>>>> wrote: >>>>>> >>>>>> >>>>>> >>>>>> On 5/12/2017 6:11 PM, Andy Lutomirski wrote: >>>>>>> >>>>>>> >>>>>>> On Thu, May 11, 2017 at 9:56 PM, Huang, Kai >>>>>>> <kai.huang@xxxxxxxxxxxxxxx> >>>>>>> wrote: >>>>>>>> >>>>>>>> >>>>>>>> I am not sure whether the cost of writing to 4 MSRs would be >>>>>>>> *extremely* >>>>>>>> slow, as when vcpu is schedule in, KVM is already doing >>>>>>>> vmcs_load, writing to several MSRs, etc. >>>>>>> >>>>>>> >>>>>>> >>>>>>> I'm speculating that these MSRs may be rather unoptimized and hence >>>>>>> unusualy slow. >>>>>>> >>>>>>>> >>>>>>>>> >>>>>>>>> Have a percpu variable that stores the current SGXLEPUBKEYHASH >along >>>>>>>>> with whatever lock is needed (probably just a mutex). Users of EINIT >>>>>>>>> will take the mutex, compare the percpu variable to the desired >>>>>>>>> value, >>>>>>>>> and, if it's different, do WRMSR and update the percpu variable. >>>>>>>>> >>>>>>>>> KVM will implement writes to SGXLEPUBKEYHASH by updating its >>>>>>>>> in-memory >>>>>>>>> state but *not* changing the MSRs. KVM will trap and emulate EINIT >>>>>>>>> to >>>>>>>>> support the same handling as the host. There is no action required >>>>>>>>> at >>>>>>>>> all on KVM guest entry and exit. >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> This is doable, but SGX driver needs to do those things and expose >>>>>>>> interfaces for KVM to use. In terms of the percpu data, it is nice to >>>>>>>> have, >>>>>>>> but I am not sure whether it is mandatory, as IMO EINIT is not even in >>>>>>>> performance critical path. We can simply read old value from MSRs out >>>>>>>> and >>>>>>>> compare whether the old equals to the new. >>>>>>> >>>>>>> >>>>>>> >>>>>>> I think the SGX driver should probably live in arch/x86, and the >>>>>>> interface could be a simple percpu variable that is exported (from the >>>>>>> main kernel image, not from a module). >>>>>>> >>>>>>>> >>>>>>>>> >>>>>>>>> FWIW, I think that KVM will, in the long run, want to trap EINIT for >>>>>>>>> other reasons: someone is going to want to implement policy for what >>>>>>>>> enclaves are allowed that applies to guests as well as the host. >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> I am not very convinced why "what enclaves are allowed" in host would >>>>>>>> apply >>>>>>>> to guest. Can you elaborate? I mean in general virtualization just >>>>>>>> focus >>>>>>>> emulating hardware behavior. If a native machine is able to run any >>>>>>>> LE, >>>>>>>> the >>>>>>>> virtual machine should be able to as well (of course, with guest's >>>>>>>> IA32_FEATURE_CONTROL[bit 17] set). >>>>>>> >>>>>>> >>>>>>> >>>>>>> I strongly disagree. I can imagine two classes of sensible policies >>>>>>> for launch control: >>>>>>> >>>>>>> 1. Allow everything. This seems quite sensible to me. >>>>>>> >>>>>>> 2. Allow some things, and make sure that VMs have at least as >>>>>>> restrictive a policy as host root has. After all, what's the point of >>>>>>> restricting enclaves in the host if host code can simply spawn a >>>>>>> little VM to run otherwise-disallowed enclaves? >>>>>> >>>>>> >>>>>> >>>>>> What's the current SGX driver launch control policy? Yes allow >>>>>> everything >>>>>> works for KVM so lets skip this. Are we going to support allowing >>>>>> several >>>>>> LEs, or just allowing one single LE? I know Jarkko is doing in-kernel LE >>>>>> staff but I don't know details. >>>>>> >>>>>> I am trying to find a way that we can both not break host launch control >>>>>> policy, and be consistent to HW behavior (from guest's view). Currently >>>>>> we >>>>>> can create a KVM guest with runtime change to IA32_SGXLEPUBKEYHASHn >>>>>> either >>>>>> enabled or disabled. I introduced an Qemu parameter 'lewr' for this >>>>>> purpose. >>>>>> Actually I introduced below Qemu SGX parameters for creating guest: >>>>>> >>>>>> -sgx epc=<size>,lehash='SHA-256 hash',lewr >>>>>> >>>>>> where 'epc' specifies guest's EPC size, lehash specifies (initial) value >>>>>> of >>>>>> guest's IA32_SGXLEPUBKEYHASHn, and 'lewr' specifies whether guest is >>>>>> allowed >>>>>> to change guest's IA32_SGXLEPUBKEYHASHn at runtime. >>>>>> >>>>>> If host only allows one single LE to run, KVM can add a restrict that >>>>>> only >>>>>> allows to create KVM guest with runtime change to >IA32_SGXLEPUBKEYHASHn >>>>>> disabled, so that only host allowed (single) hash can be used by guest. >>>>>> From >>>>>> guest's view, it simply has IA32_FEATURE_CONTROL[bit17] cleared and >has >>>>>> IA32_SGXLEPUBKEYHASHn with default value to be host allowed (single) >>>>>> hash. >>>>>> >>>>>> If host allows several LEs (not but everything), and if we create guest >>>>>> with >>>>>> 'lewr', then the behavior is not consistent with HW behavior, as from >>>>>> guest's hardware's point of view, we can actually run any LE but we have >>>>>> to >>>>>> tell guest that you are only allowed to change IA32_SGXLEPUBKEYHASHn >to >>>>>> some >>>>>> specific values. One compromise solution is we don't allow to create >>>>>> guest >>>>>> with 'lewr' specified, and at the meantime, only allow to create guest >>>>>> with >>>>>> host approved hashes specified in 'lehash'. This will make guest's >>>>>> behavior >>>>>> consistent to HW behavior but only allows guest to run one LE (which is >>>>>> specified by 'lehash' when guest is created). >>>>> >>>>> >>>>> I'm not sure I entirely agree for a couple reasons. >>>>> >>>>> 1. I wouldn't be surprised if the kernel ends up implementing a policy >>>>> in which it checks all enclaves (not just LEs) for acceptability. In >>>>> fact, if the kernel sticks with the "no LE at all or just >>>>> kernel-internal LE", then checking enclaves directly against some >>>>> admin- or distro-provided signer list seems reasonable. This type of >>>>> policy can't be forwarded to a guest by restricting allowed LE >>>>> signers. But this is mostly speculation since AFAIK no one has >>>>> seriously proposed any particular policy support and the plan was to >>>>> not have this for the initial implementation. >>>>> >>>>> 2. While matching hardware behavior is nice in principle, there >>>>> doesn't seem to be useful hardware behavior to match here. If the >>>>> host had a list of five allowed LE signers, how exactly would it >>>>> restrict the MSRs? They're not written atomically, so you can't >>>>> directly tell what's being written. >>>> >>>> >>>> In this case I actually plan to just allow creating guest with guest's >>>> IA32_SGXLEPUBKEYHASHn disabled (without 'lewr' specified). If 'lewr' is >>>> specified, creating guest will fail. And we only allow creating guest with >>>> host allowed hash values (with 'lehash=hash-value'), and if 'hash-value' >>>> specified by 'lehash' is not allowed by host, we also fail to create guest. >>>> >>>> We can only allow creating guest with 'lewr' specified when host allows >>>> anything. >>>> >>>> But in this way, we are restricting guest OS's ability to run LE, as only >>>> one LE, that is specified by 'lehash' parameter, can be run. But I think >>>> this won't hurt much, as multiple guests still are able to run different >>>> LEs? >>>> >>>> Also, the only way to fail an MSR >>>>> >>>>> write is to send #GP, and Windows (and even Linux) may not expect >>>>> that. Linux doesn't panic due to #GP on MSR writes these days, but >>>>> you still get a big fat warning. I wouldn't be at all surprised if >>>>> Windows BSODs. >>>> >>>> >>>> We cannot allow writing some particular value to MSRs successfully, while >>>> injecting #GP when writing other values to the same MSRs. So #GP is not >>>> option. >>>> >>>> ENCLS[EINIT], on the other hand, returns an actual >>>>> >>>>> error code. I'm not sure that a sensible error code exists >>>>> ("SGX_HYPERVISOR_SAID_NO?", perhaps), >>>> >>>> >>>> Looks no such error code exists. And we cannot return such error code to >>>> guest as such error code is only supposed to be valid when ENCLS is run in >>>> hypervisor. >>>> >>>> but SGX_INVALID_EINITTOKEN seems >>>>> >>>>> to mean, more or less, "the CPU thinks you're not authorized to do >>>>> this", so forcing that error code could be entirely reasonable. >>>>> >>>>> If the host policy is to allow a list of LE signers, you could return >>>>> SGX_INVALID_EINITTOKEN if the guest tries to EINIT an LE that isn't in >>>>> the list. >>>> >>>> >>>> But this would be inconsistent with HW behavior. If the hash value in >>>> guest's IA32_SGXLEPUBKEYHASHn is matched with the one passed by EINIT, >EINIT >>>> is not supposed to return SGX_INVALID_EINITTOKEN. >>>> >>>> I think from VMM's perspective, emulating HW behavior to be consistent >>>> with real HW behavior is very important. >>>> >>>> Paolo, would you provide your comments? >>> >>> >>> Hi all, >>> >>> This has been quite for a while and I'd like to start discussion again. >>> Jarkko told me that currently he only supports one LE in SGX driver, but I >>> am not sure whether he is going to extend in the future or not. I think this >>> might also depend on requirements from customers. >>> >>> Andy, >>> >>> If we only support one LE in driver, then we can only support the same LE >>> for all KVM guests, according to your comments that host kernel launch >>> control policy should also apply to KVM guests? WOuld you comments more? >> >> I'm not at all convinced that, going forward, Linux's host-side launch >> control policy will be entirely contained in the LE. I'm also not >> convinced that non-Linux guests will function at all under this type >> of policy -- what if FreeBSD's LE is different for whatever reason? > >I am not convinced either. I think we need Jarkko to elaborate how is >host side launch control policy implemented, or is there any policy at >all. I also tried to read SGX driver code but looks I couldn't find any >implementation regarding to this. > >Hi Jarkko, > >Can you elaborate on this? > >Thanks, >-Kai I don't think the kernel's LE policy is relevant here. As long as you allow the guest OS to set the IA32_SGXLEPUBKEYHASHn MSRs, either directly or by the VMM 'lehash' value, you don't need a support for more than on LE in the kernel. The host kernel will have one LE and the guest kernel has another "one" LE that may have a different hash value. I agree with Andy that different OS distributions are likely to have different LE hash values - so the guest OS may require different hash setting. >> >>> >>> Jarkko, >>> >>> Could you help to clarify the whole launch control policy in host side so >>> that we can have a better understanding together? >>> >>> Thanks, >>> -Kai >>> >>>> >>>>> >>>>> --Andy >>>>> >>> >>