On Wed, Aug 29, 2018 at 01:58:09PM -0700, Huang, Kai wrote:
> > -----Original Message-----
> > From: Christopherson, Sean J
> > Sent: Thursday, August 30, 2018 8:34 AM
> > To: Huang, Kai <kai.huang@xxxxxxxxx>
> > Cc: Jarkko Sakkinen <jarkko.sakkinen@xxxxxxxxxxxxxxx>;
> > platform-driver-x86@xxxxxxxxxxxxxxx; x86@xxxxxxxxxx; nhorman@xxxxxxxxxx;
> > linux-kernel@xxxxxxxxxxxxxxx; tglx@xxxxxxxxxxxxx; suresh.b.siddha@xxxxxxxxx;
> > Ayoun, Serge <serge.ayoun@xxxxxxxxx>; hpa@xxxxxxxxx; npmccallum@xxxxxxxxxx;
> > mingo@xxxxxxxxxx; linux-sgx@xxxxxxxxxxxxxxx; Hansen, Dave
> > <dave.hansen@xxxxxxxxx>
> > Subject: Re: [PATCH v13 10/13] x86/sgx: Add sgx_einit() for initializing enclaves
> >
> > On Wed, Aug 29, 2018 at 12:33:54AM -0700, Huang, Kai wrote:
> > > [snip..]
> > >
> > > > > > @@ -38,6 +39,18 @@ static LIST_HEAD(sgx_active_page_list);
> > > > > >  static DEFINE_SPINLOCK(sgx_active_page_list_lock);
> > > > > >  static struct task_struct *ksgxswapd_tsk;
> > > > > >  static DECLARE_WAIT_QUEUE_HEAD(ksgxswapd_waitq);
> > > > > > +static struct notifier_block sgx_pm_notifier;
> > > > > > +static u64 sgx_pm_cnt;
> > > > > > +
> > > > > > +/* The cache for the last known values of IA32_SGXLEPUBKEYHASHx MSRs for each
> > > > > > + * CPU. The entries are initialized when they are first used by sgx_einit().
> > > > > > + */
> > > > > > +struct sgx_lepubkeyhash {
> > > > > > +	u64 msrs[4];
> > > > > > +	u64 pm_cnt;
> > > > >
> > > > > May I ask why do we need pm_cnt here? In fact why do we need
> > > > > suspend stuff (namely, sgx_pm_cnt above, and related code in this
> > > > > patch) here in this patch? From the patch commit message I don't
> > > > > see why we need PM stuff here. Please give comment why you need PM
> > > > > stuff, or you may consider to split the PM stuff to another patch.
> > > >
> > > > Refining the commit message probably makes more sense because
> > > > without PM code sgx_einit() would be broken. The MSRs have been reset
> > > > after waking up.
> > > >
> > > > Some kind of counter is required to keep track of the power cycle.
> > > > When going to sleep the sgx_pm_cnt is increased. sgx_einit()
> > > > compares the current value of the global count to the value in the
> > > > cache entry to see whether we are in a new power cycle.
> > >
> > > You mean reset to Intel default? I think we can also just reset the
> > > cached MSR values on each power cycle, which would be simpler, IMHO?
> >
> > Refresh my brain, does hardware reset the MSRs on a transition to S3 or lower?
> >
> > > I think we definitely need some code to handle S3-S5, but should be in
> > > separate patches, since I think the major impact of S3-S5 is entire
> > > EPC being destroyed. I think keeping pm_cnt is not sufficient enough
> > > to handle such case?
> > > >
> > > > This brings up one question though: how do we deal with VM host going to
> > > > sleep? VM guest would not be aware of this.
> > >
> > > IMO VM just gets "sudden loss of EPC" after suspend & resume in host.
> > > SGX driver and SDK should be able to handle "sudden loss of EPC", ie,
> > > co-working together to re-establish the missing enclaves.
> > >
> > > Actually supporting "sudden loss of EPC" is a requirement to support
> > > live migration of VM w/ SGX. Internally long time ago we had a
> > > discussion and the decision was we should support SGX live migration given
> > > two facts:
> > >
> > > 1) losing platform-dependent is not important. For example, losing
> > > sealing key is not a problem, as we could get secrets provisioned
> > > again from remote.
> > > 2) Both windows & linux driver commit to support "sudden loss of EPC".
> > >
> > > I don't think we have to support in very first upstream driver, but I
> > > think we need to support someday.
> >
> > Actually, we can easily support this in the driver, at least for SGX1 hardware.
>
> That's my guess too. Just want to check whether we are still on the same page :)
>
> > SGX2 isn't difficult to handle, but we've intentionally postponed those patches
> > until SGX1 support is in mainline[1].
> > Accesses to the EPC after it is lost will cause faults. Userspace EPC accesses, e.g.
> > ERESUME, will get a SIGSEGV that the process should interpret as an "I should
> > restart my enclave" event. The SDK already does this. In the driver, we just need
> > to be aware of this potential behavior and not freak out. Specifically, SGX_INVD
> > needs to not WARN on faults that may have been due to the EPC being nuked.
> > I think we can even remove the sgx_encl_pm_notifier() code altogether.
>
> Possibly we still need to do some cleanup, ie, all structures of enclaves, upon resume?

Not for functional reasons. The driver will automatically do the cleanup
via SGX_INVD when it next accesses the enclave's pages and takes a fault,
e.g. during reclaim. Proactively reclaiming the EPC pages would probably
affect performance, though not necessarily in a good way. And I think it
would be beneficial to get the driver out of the suspend/hibernate/resume
paths, e.g. zapping all enclaves could noticeably impact suspend/resume
latency.

> Anyway I am just guessing.
>
> Thanks,
> -Kai
>
> >
> > [1] SGX1 hardware signals a #GP on an access to an invalid EPC page.
> >     SGX2 signals a #PF with the PF_SGX error code bit set. This is
> >     problematic because the kernel looks at the PTEs for CR2 and sees
> >     nothing wrong, so it thinks it should just restart the
> >     instruction, leading to an infinite fault loop. Resolving this
> >     is fairly straightforward, but a complete fix requires propagating
> >     PF_SGX down to the ENCLS fixup handler, which means plumbing the
> >     error code through the fixup handlers or smushing PF_SGX into
> >     trapnr. Since there is a parallel effort to plumb the error code
> >     through the handlers, https://lkml.org/lkml/2018/8/6/924, we opted
> >     to do this in a separate series.
> >
> > > Sean,
> > >
> > > Would you be able to comment here?
> > >
> > > >
> > > > I think the best measure would be to add a new parameter to
> > > > sgx_einit() that enforces update of the MSRs. The driver can then
> > > > set this parameter in the case when sgx_einit() returns
> > > > SGX_INVALID_LICENSE. This is coherent because the driver requires
> > > > writable MSRs. It would not be coherent to do it directly in the core because
> > > > KVM does not require writable MSRs.
> > >
> > > IMHO this is not required, as I mentioned above.
> > >
> > > And
> > > [snip...]
> > >
> > > Thanks,
> > > -Kai
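
For reference, the power-cycle check Jarkko describes above (bump
sgx_pm_cnt on suspend, compare it against the cached pm_cnt in
sgx_einit()) can be sketched as a small user-space simulation. This is
only a sketch under assumptions, not the code from the patch:
sim_wrmsr(), sketch_suspend(), sketch_update_lepubkeyhash(), the valid
flag, the fixed CPU count and the reset value are all made up for
illustration, and it assumes the MSRs really are reset across a power
cycle, which is still an open question in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_NR_CPUS   4	/* hypothetical fixed CPU count */
#define SKETCH_NR_MSRS   4	/* four IA32_SGXLEPUBKEYHASHx MSRs */
#define SIM_RESET_VALUE  0xdeadbeefULL	/* placeholder, not the real default */

/* Fields msrs/pm_cnt follow the quoted patch; 'valid' is an addition here. */
struct sgx_lepubkeyhash {
	uint64_t msrs[SKETCH_NR_MSRS];
	uint64_t pm_cnt;
	bool valid;		/* false until the entry is first used */
};

static uint64_t sgx_pm_cnt;	/* global power-cycle counter */
static struct sgx_lepubkeyhash sketch_cache[SKETCH_NR_CPUS];

/* Simulated per-CPU MSR storage standing in for real hardware. */
static uint64_t sim_msrs[SKETCH_NR_CPUS][SKETCH_NR_MSRS];

static void sim_wrmsr(unsigned int cpu, unsigned int idx, uint64_t val)
{
	sim_msrs[cpu][idx] = val;
}

/* Stand-in for the PM notifier: count the cycle, lose the MSR contents. */
static void sketch_suspend(void)
{
	sgx_pm_cnt++;
	for (unsigned int c = 0; c < SKETCH_NR_CPUS; c++)
		for (unsigned int i = 0; i < SKETCH_NR_MSRS; i++)
			sim_msrs[c][i] = SIM_RESET_VALUE;
}

/*
 * Write only the MSRs whose cached value is missing, stale (entry is from
 * an older power cycle) or different. Returns the number of MSR writes.
 */
static unsigned int sketch_update_lepubkeyhash(unsigned int cpu,
					       const uint64_t hash[SKETCH_NR_MSRS])
{
	unsigned int writes = 0;

	assert(cpu < SKETCH_NR_CPUS);

	struct sgx_lepubkeyhash *cache = &sketch_cache[cpu];
	bool stale = !cache->valid || cache->pm_cnt != sgx_pm_cnt;

	for (unsigned int i = 0; i < SKETCH_NR_MSRS; i++) {
		if (stale || cache->msrs[i] != hash[i]) {
			sim_wrmsr(cpu, i, hash[i]);
			cache->msrs[i] = hash[i];
			writes++;
		}
	}

	cache->pm_cnt = sgx_pm_cnt;
	cache->valid = true;
	return writes;
}
```

In this sketch a second call on the same CPU with the same hash does no
MSR writes, while the first call after sketch_suspend() rewrites all
four. Kai's simpler alternative of resetting the cached values on each
power cycle would drop the pm_cnt comparison and instead clear every
per-CPU entry in the suspend path.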