On Fri, Sep 18, 2020 at 05:09:19PM -0700, Sean Christopherson wrote: > On Fri, Sep 18, 2020 at 03:39:32PM +0300, Jarkko Sakkinen wrote: > > On Fri, Sep 18, 2020 at 03:20:39PM +0300, Jarkko Sakkinen wrote: > > > On Thu, Sep 17, 2020 at 07:09:40PM -0700, Sean Christopherson wrote: > > > > On Thu, Sep 17, 2020 at 01:35:10PM -0500, Haitao Huang wrote: > > > > > On Thu, 17 Sep 2020 11:02:06 -0500, Jarkko Sakkinen > > > > > <jarkko.sakkinen@xxxxxxxxxxxxxxx> wrote: > > > > > > > > > > > > Right, I do get the OOM case but wouldn't in that case the reasonable > > > > > > thing to do destroy the enclave that is not even running? I mean that > > > > > > means that we are globally out of EPC. > > > > > > > > > > > > > > > > I would say it could be a policy, but not the only one. If it does not make > > > > > much difference to kernel, IMHO we should not set it in stone now. > > > > > Debugging is also huge benefit to me. > > > > > > > > Agreed, an EPC cgroup is the proper way to define/enforce what happens when > > > > there is EPC pressure. E.g. if process A is consuming 99% of the EPC, then > > > > it doesn't make sense to unconditionally kill enclaves from process B. If > > > > the admin wants to give process A priority, so be it, but such a decision > > > > shouldn't be baked into the kernel. > > > > > > > > This series obviously doesn't provide an EPC cgroup, but that doesn't mean > > > > we can't make decisions that will play nice with a cgroup in the future. > > > > > > Here's the core issue why the API "as is used to be" does not work: > > > > > > if (ret == -EIO) { > > > mutex_lock(&encl->lock); > > > sgx_encl_destroy(encl); > > > mutex_unlock(&encl->lock); > > > } > > > > > > It would be better to instead whitelist *when* the enclave is preserved. > > > > > > if (ret != -ENOMEM) { > > > mutex_lock(&encl->lock); > > > sgx_encl_destroy(encl); > > > mutex_unlock(&encl->lock); > > > } > > > > > > That is the information we *deterministically* want to know. Otherwise, > > > we will live in ultimate chaos. > > > > > > Only this way can caller know when there are means to continue, and when > > > to quit. I.e. the code is whitelisting wrong way around currently. > > > > Actually since the state cannot corrupt unless EADD or EEXTEND fails it > > is fine to have the enclave alive on any other error condition. I think > > EADD and EEXTEND failure don't corrupt state. Killing the enclave if EEXTEND > fails makes sense because failure at that point is either due to a kernel bug > or loss of EPC, both of which are fatal to the enclave. This is also true. I meant by corrupt state e.g. a kernel bug, which causes uninitalizes pages go the free queue. I'd rephrase this in kdoc as: "The function deinitializes enclave and returns -EIO when EPC is lost, while entering to a new power cycle". Documentation describes only legit behaviour, let's ignore the corrupt part. > EADD is a little different, e.g. it could fault due to a bad source address, > in which case the failure is not technically fatal. But, Jarkko wanted to > have consistent behavior for EADD and EEXTEND failures, and practically > speaking the enclave is probably hosed anyways if EADD fails, i.e. killing the > enclave on EADD failure isn't a sticking point (for me). We need to figure out own return value for EADD, but I agree with this. I would go with -EFAULT as we do when source VMA is no available. Does this make sense to you? /Jarkko