On Tue, Jun 6, 2017 at 1:52 PM, Huang, Kai <kai.huang@xxxxxxxxxxxxxxx> wrote: > > > On 5/18/2017 7:45 PM, Huang, Kai wrote: >> >> >> >> On 5/17/2017 12:09 PM, Andy Lutomirski wrote: >>> >>> On Mon, May 15, 2017 at 5:48 PM, Huang, Kai <kai.huang@xxxxxxxxxxxxxxx> >>> wrote: >>>> >>>> >>>> >>>> On 5/12/2017 6:11 PM, Andy Lutomirski wrote: >>>>> >>>>> >>>>> On Thu, May 11, 2017 at 9:56 PM, Huang, Kai <kai.huang@xxxxxxxxxxxxxxx> >>>>> wrote: >>>>>> >>>>>> >>>>>> I am not sure whether the cost of writing to 4 MSRs would be >>>>>> *extremely* >>>>>> slow, as when vcpu is schedule in, KVM is already doing vmcs_load, >>>>>> writing >>>>>> to several MSRs, etc. >>>>> >>>>> >>>>> >>>>> I'm speculating that these MSRs may be rather unoptimized and hence >>>>> unusualy slow. >>>>> >>>>>> >>>>>>> >>>>>>> Have a percpu variable that stores the current SGXLEPUBKEYHASH along >>>>>>> with whatever lock is needed (probably just a mutex). Users of EINIT >>>>>>> will take the mutex, compare the percpu variable to the desired >>>>>>> value, >>>>>>> and, if it's different, do WRMSR and update the percpu variable. >>>>>>> >>>>>>> KVM will implement writes to SGXLEPUBKEYHASH by updating its >>>>>>> in-memory >>>>>>> state but *not* changing the MSRs. KVM will trap and emulate EINIT >>>>>>> to >>>>>>> support the same handling as the host. There is no action required >>>>>>> at >>>>>>> all on KVM guest entry and exit. >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> This is doable, but SGX driver needs to do those things and expose >>>>>> interfaces for KVM to use. In terms of the percpu data, it is nice to >>>>>> have, >>>>>> but I am not sure whether it is mandatory, as IMO EINIT is not even in >>>>>> performance critical path. We can simply read old value from MSRs out >>>>>> and >>>>>> compare whether the old equals to the new. >>>>> >>>>> >>>>> >>>>> I think the SGX driver should probably live in arch/x86, and the >>>>> interface could be a simple percpu variable that is exported (from the >>>>> main kernel image, not from a module). >>>>> >>>>>> >>>>>>> >>>>>>> FWIW, I think that KVM will, in the long run, want to trap EINIT for >>>>>>> other reasons: someone is going to want to implement policy for what >>>>>>> enclaves are allowed that applies to guests as well as the host. >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> I am not very convinced why "what enclaves are allowed" in host would >>>>>> apply >>>>>> to guest. Can you elaborate? I mean in general virtualization just >>>>>> focus >>>>>> emulating hardware behavior. If a native machine is able to run any >>>>>> LE, >>>>>> the >>>>>> virtual machine should be able to as well (of course, with guest's >>>>>> IA32_FEATURE_CONTROL[bit 17] set). >>>>> >>>>> >>>>> >>>>> I strongly disagree. I can imagine two classes of sensible policies >>>>> for launch control: >>>>> >>>>> 1. Allow everything. This seems quite sensible to me. >>>>> >>>>> 2. Allow some things, and make sure that VMs have at least as >>>>> restrictive a policy as host root has. After all, what's the point of >>>>> restricting enclaves in the host if host code can simply spawn a >>>>> little VM to run otherwise-disallowed enclaves? >>>> >>>> >>>> >>>> What's the current SGX driver launch control policy? Yes allow >>>> everything >>>> works for KVM so lets skip this. Are we going to support allowing >>>> several >>>> LEs, or just allowing one single LE? I know Jarkko is doing in-kernel LE >>>> staff but I don't know details. >>>> >>>> I am trying to find a way that we can both not break host launch control >>>> policy, and be consistent to HW behavior (from guest's view). Currently >>>> we >>>> can create a KVM guest with runtime change to IA32_SGXLEPUBKEYHASHn >>>> either >>>> enabled or disabled. I introduced an Qemu parameter 'lewr' for this >>>> purpose. >>>> Actually I introduced below Qemu SGX parameters for creating guest: >>>> >>>> -sgx epc=<size>,lehash='SHA-256 hash',lewr >>>> >>>> where 'epc' specifies guest's EPC size, lehash specifies (initial) value >>>> of >>>> guest's IA32_SGXLEPUBKEYHASHn, and 'lewr' specifies whether guest is >>>> allowed >>>> to change guest's IA32_SGXLEPUBKEYHASHn at runtime. >>>> >>>> If host only allows one single LE to run, KVM can add a restrict that >>>> only >>>> allows to create KVM guest with runtime change to IA32_SGXLEPUBKEYHASHn >>>> disabled, so that only host allowed (single) hash can be used by guest. >>>> From >>>> guest's view, it simply has IA32_FEATURE_CONTROL[bit17] cleared and has >>>> IA32_SGXLEPUBKEYHASHn with default value to be host allowed (single) >>>> hash. >>>> >>>> If host allows several LEs (not but everything), and if we create guest >>>> with >>>> 'lewr', then the behavior is not consistent with HW behavior, as from >>>> guest's hardware's point of view, we can actually run any LE but we have >>>> to >>>> tell guest that you are only allowed to change IA32_SGXLEPUBKEYHASHn to >>>> some >>>> specific values. One compromise solution is we don't allow to create >>>> guest >>>> with 'lewr' specified, and at the meantime, only allow to create guest >>>> with >>>> host approved hashes specified in 'lehash'. This will make guest's >>>> behavior >>>> consistent to HW behavior but only allows guest to run one LE (which is >>>> specified by 'lehash' when guest is created). >>> >>> >>> I'm not sure I entirely agree for a couple reasons. >>> >>> 1. I wouldn't be surprised if the kernel ends up implementing a policy >>> in which it checks all enclaves (not just LEs) for acceptability. In >>> fact, if the kernel sticks with the "no LE at all or just >>> kernel-internal LE", then checking enclaves directly against some >>> admin- or distro-provided signer list seems reasonable. This type of >>> policy can't be forwarded to a guest by restricting allowed LE >>> signers. But this is mostly speculation since AFAIK no one has >>> seriously proposed any particular policy support and the plan was to >>> not have this for the initial implementation. >>> >>> 2. While matching hardware behavior is nice in principle, there >>> doesn't seem to be useful hardware behavior to match here. If the >>> host had a list of five allowed LE signers, how exactly would it >>> restrict the MSRs? They're not written atomically, so you can't >>> directly tell what's being written. >> >> >> In this case I actually plan to just allow creating guest with guest's >> IA32_SGXLEPUBKEYHASHn disabled (without 'lewr' specified). If 'lewr' is >> specified, creating guest will fail. And we only allow creating guest with >> host allowed hash values (with 'lehash=hash-value'), and if 'hash-value' >> specified by 'lehash' is not allowed by host, we also fail to create guest. >> >> We can only allow creating guest with 'lewr' specified when host allows >> anything. >> >> But in this way, we are restricting guest OS's ability to run LE, as only >> one LE, that is specified by 'lehash' parameter, can be run. But I think >> this won't hurt much, as multiple guests still are able to run different >> LEs? >> >> Also, the only way to fail an MSR >>> >>> write is to send #GP, and Windows (and even Linux) may not expect >>> that. Linux doesn't panic due to #GP on MSR writes these days, but >>> you still get a big fat warning. I wouldn't be at all surprised if >>> Windows BSODs. >> >> >> We cannot allow writing some particular value to MSRs successfully, while >> injecting #GP when writing other values to the same MSRs. So #GP is not >> option. >> >> ENCLS[EINIT], on the other hand, returns an actual >>> >>> error code. I'm not sure that a sensible error code exists >>> ("SGX_HYPERVISOR_SAID_NO?", perhaps), >> >> >> Looks no such error code exists. And we cannot return such error code to >> guest as such error code is only supposed to be valid when ENCLS is run in >> hypervisor. >> >> but SGX_INVALID_EINITTOKEN seems >>> >>> to mean, more or less, "the CPU thinks you're not authorized to do >>> this", so forcing that error code could be entirely reasonable. >>> >>> If the host policy is to allow a list of LE signers, you could return >>> SGX_INVALID_EINITTOKEN if the guest tries to EINIT an LE that isn't in >>> the list. >> >> >> But this would be inconsistent with HW behavior. If the hash value in >> guest's IA32_SGXLEPUBKEYHASHn is matched with the one passed by EINIT, EINIT >> is not supposed to return SGX_INVALID_EINITTOKEN. >> >> I think from VMM's perspective, emulating HW behavior to be consistent >> with real HW behavior is very important. >> >> Paolo, would you provide your comments? > > > Hi all, > > This has been quite for a while and I'd like to start discussion again. > Jarkko told me that currently he only supports one LE in SGX driver, but I > am not sure whether he is going to extend in the future or not. I think this > might also depend on requirements from customers. > > Andy, > > If we only support one LE in driver, then we can only support the same LE > for all KVM guests, according to your comments that host kernel launch > control policy should also apply to KVM guests? WOuld you comments more? I'm not at all convinced that, going forward, Linux's host-side launch control policy will be entirely contained in the LE. I'm also not convinced that non-Linux guests will function at all under this type of policy -- what if FreeBSD's LE is different for whatever reason? > > Jarkko, > > Could you help to clarify the whole launch control policy in host side so > that we can have a better understanding together? > > Thanks, > -Kai > >> >>> >>> --Andy >>> >