On Thu, 2021-11-04 at 06:50 -0700, Dave Hansen wrote: > On 11/3/21 4:22 PM, Jarkko Sakkinen wrote: > > The backing memory was not freed, after loading it back to the SGX > > reserved memory. This problem was not prevalent with the systems that > > were available at the time when SGX was first upstreamed, as the swapped > > memory could grow only up to 128 MB. However, Icelake Server can have > > gigabytes of SGX memory, and thus this has become a real bottleneck. > > > > Free the swap space for other use by calling shmem_truncate_range(), > > when a page is faulted back. > > This needs a _bit_ more context. Could you include a few sentences > about what backing memory is? > > It's also not a big deal but it is nice to include something like: > > This was found by inspection. > > and: > > Reported-by: Dave Hansen <dave.hansen@xxxxxxxxxxxxxxx> Oops, I'm sorry, I was planning to add this but forgot to do it. > Also, what is the end-user-visible effect of this bug? What would a > user see if they were impacted by this? How did you determine that this > fixed the issue? This memory will be recovered when the enclave is torn > down, right? You're absolutely right. I just realized how to make the whole thing concrete and reproduce OOM with the 128 MB EPC that I have in my i5-9600KF desktop. I'll just reconfigure my test VM to have sufficiently small amount of RAM, and come up with an appropriate workload. > Do we also need to deal with truncating the PCMD? (For those watching > along at home, there are two things SGX swaps to RAM: the actual page > data and also some metadata that ensures page integrity and helps > prevent things like rolling back to old versions of swapped pages) Yes. This can be achieved by iterating through all of the enclave pages, which share the same shmem page for storing their PCMD's, as the one being faulted back. If none of those pages is swapped, the PCMD page can safely truncated. I guess it makes sense to put this into patch of its own. /Jarkko