On 4/28/22 15:20, Reinette Chatre wrote:
> Hi Dave,
>
> On 4/28/2022 2:30 PM, Dave Hansen wrote:
>> On 4/28/22 13:11, Reinette Chatre wrote:
>
>> Are there any transient, recoverable errors that can come back from
>> ELDU? If so, this makes a lot of sense. If not, then it doesn't make a
>> lot of sense to preserve the swapped-out content because the enclave is
>> going to die anyway.
>
> Good point.
>
> Theoretically ELDU could encounter a page fault while accessing the
> regions it needs to read from and write to. These faults are passed
> through and the instruction would return with a #PF that is
> propagated with the page fault handler returning SIGBUS.

We don't have to worry about those, though, do we? We're operating
entirely on kernel mappings that won't cause #PF.

> Even so, this flow also impacts the SGX2 flows that need to load pages from
> the backing store. In this case the kernel would pass it as an error
> (-EFAULT) to the runtime but it would not result in the
> enclave being killed. If it was a #PF that caused the issue then
> perhaps theoretically the SGX2 instruction has a chance of succeeding
> if the runtime attempts it again?

How are the SGX2 flows different than what we have now?

I also looked a little deeper at this transient failure problem. The
ELDU documentation also mentions a possible error code of:

	SGX_EPC_PAGE_CONFLICT

It *looks* like there can be conflicts on the SECS page as well as the
EPC page being explicitly accessed. Is that a possible problem here?
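
To make the "transient vs. fatal" distinction concrete, here is a rough,
userspace-only sketch of a bounded retry around an ELDU-like call. It
assumes SGX_EPC_PAGE_CONFLICT really is transient, which is exactly the
open question above. None of this is kernel code: eldu_retry(),
fake_eldu() and the ELDU_* constants are hypothetical names made up for
illustration, and the numeric values are placeholders.

```c
/* Hypothetical stand-ins for illustration; not the kernel's definitions. */
enum eldu_result {
	ELDU_OK = 0,
	ELDU_PAGE_CONFLICT = 7,	/* models SGX_EPC_PAGE_CONFLICT; value is an assumption */
	ELDU_FATAL = -1,	/* models any unrecoverable failure */
};

/* Signature of the ELDU-like operation being retried (assumed shape). */
typedef int (*eldu_fn)(void *ctx);

/*
 * Call @eldu at most @max_tries times. Only the conflict code is treated
 * as transient; success or any other error ends the loop immediately.
 * Returns the last result and stores the number of calls made in @tries.
 * With @max_tries <= 0 no call is made and ELDU_FATAL is returned.
 */
static int eldu_retry(eldu_fn eldu, void *ctx, int max_tries, int *tries)
{
	int ret = ELDU_FATAL;
	int n = 0;

	while (n < max_tries) {
		ret = eldu(ctx);
		n++;
		if (ret != ELDU_PAGE_CONFLICT)
			break;
	}

	if (tries)
		*tries = n;
	return ret;
}

/* Test double: reports a conflict until the counter in @ctx reaches zero. */
static int fake_eldu(void *ctx)
{
	int *conflicts_left = ctx;

	if (*conflicts_left > 0) {
		(*conflicts_left)--;
		return ELDU_PAGE_CONFLICT;
	}
	return ELDU_OK;
}
```

With two pending conflicts and a budget of five tries, eldu_retry()
succeeds on the third call. With a budget smaller than the number of
conflicts, it gives up and hands back the conflict code, at which point
the caller would still need the swapped-out content to be intact.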