Re: [PATCH v2 14/18] x86/sgx: Add EPC OOM path to forcefully reclaim EPC

On Thu, 2022-12-08 at 15:21 +0000, Jarkko Sakkinen wrote:
> On Fri, Dec 02, 2022 at 10:36:50AM -0800, Kristen Carlson Accardi
> wrote:
> > From: Sean Christopherson <sean.j.christopherson@xxxxxxxxx>
> > 
> > Introduce the OOM path for killing an enclave with the reclaimer
> > is no longer able to reclaim enough EPC pages. Find a victim
> > enclave,
> > which will be an enclave with EPC pages remaining that are not
> > accessible to the reclaimer ("unreclaimable"). Once a victim is
> > identified, mark the enclave as OOM and zap the enclaves entire
> > page range. Release all the enclaves resources except for the
> > struct sgx_encl memory itself.
> > 
> > Signed-off-by: Sean Christopherson
> > <sean.j.christopherson@xxxxxxxxx>
> > Signed-off-by: Kristen Carlson Accardi <kristen@xxxxxxxxxxxxxxx>
> > Cc: Sean Christopherson <seanjc@xxxxxxxxxx>
> 
> Why this patch is dependent of all 13 patches before it?
> 
> Looks like something that is orthogonal to cgroups and could be
> live by its own. At least it probably does not require all of
> those patches, or does it?
> 
> Even without cgroups it would make sense to killing enclaves if
> reclaimer gets stuck.
> 
> BR, Jarkko

First of all, it depends on having the LRU struct with the
unreclaimable/reclaimable lists, which in turn requires storing the
enclave pointer in the page as well. It also depends on knowing how many
pages are available, being able to ignore the age of a page, etc. Right
now, without cgroups, SGX is unable to allocate memory when an enclave
is created if it cannot reclaim enough memory from the existing in-use
enclaves.
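
To make the dependency concrete, here is a rough userspace sketch of the
shape of the data involved. It is not the code from the series: every
name in it (struct encl, struct epc_page, struct epc_lru, lru_push,
pick_oom_victim) is a hypothetical, simplified stand-in, and locking and
page aging are left out entirely. It only shows why victim selection
needs both the unreclaimable list and a back-pointer from the page to
its enclave.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-ins; not the kernel's real definitions. */
struct encl {
	int id;
	int oom;			/* set once the enclave is chosen as a victim */
};

struct epc_page {
	struct encl *owner;		/* back-pointer from the page to its enclave */
	struct epc_page *next;
};

struct epc_lru {
	struct epc_page *reclaimable;	/* pages the reclaimer can still evict */
	struct epc_page *unreclaimable;	/* pages the reclaimer cannot touch */
};

/* Push a page onto the head of one of the LRU's lists. */
static void lru_push(struct epc_page **head, struct epc_page *page)
{
	page->next = *head;
	*head = page;
}

/*
 * Pick a victim: the first enclave that still owns an unreclaimable page
 * and has not already been marked OOM. Returns NULL if there is none.
 */
static struct encl *pick_oom_victim(struct epc_lru *lru)
{
	struct epc_page *p;

	for (p = lru->unreclaimable; p; p = p->next) {
		if (p->owner && !p->owner->oom) {
			p->owner->oom = 1;
			return p->owner;
		}
	}
	return NULL;
}
```

Without the owner pointer there is no way to get from a page on the
unreclaimable list back to the enclave that should be killed, and
without the separate list there is nothing to walk in the first place.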

Aside from that, though, I don't think that killing enclaves makes sense
outside the context of cgroup limits. Without cgroup limits, you have a
maximum number of EPC pages that can be active at any one time. If an
enclave attempts to allocate a new page and the reclaimer can't free up
any, how would you decide whether it's OK to kill an entire enclave in
order to give this other enclave higher priority for getting a page?
With a cgroup limit, the system owner can explicitly decide what the
limits on usage will be, but without that, you'd have a situation where
one new enclave could kill others, I would think. Better to just leave
it the way it is - new page allocations fail if there are no free
pages, but you don't kill enclaves that already exist.




