On Thu, 2023-10-12 at 08:27 -0500, Haitao Huang wrote:
> On Tue, 10 Oct 2023 19:51:17 -0500, Huang, Kai <kai.huang@xxxxxxxxx> wrote:
> [...]
> > (btw, even you track VA/SECS pages in unreclaimable list, given they both have
> > 'enclave' as the owner, do you still need SGX_EPC_OWNER_ENCL and
> > SGX_EPC_OWNER_PAGE ?)
>
> Let me think about it, there might be also a way just track encl objects
> not unreclaimable pages.
>
> I still not get why we need kill the VM not just remove just enough pages.
> Is it due to the static allocation not able to reclaim?

We can choose to "just remove enough EPC pages".  The VM may or may not be
killed when it wants the EPC pages back, depending on whether the current EPC
cgroup can provide enough EPC pages or not.  And this depends on how we
implement the cgroup algorithm to reclaim EPC pages.

One problem could be: for an EPC cgroup that only has SGX VMs, you may end up
moving EPC pages from one VM to another and then back again endlessly, because
you never actually mark any VM as dead the way OOM does for normal enclaves.

From this point of view, you still need a way to kill a VM, IIUC.

I think the key point of virtual EPC vs cgroup, as quoted from Sean, should be
"having sane, well-defined behavior".

Does "just remove enough EPC pages" meet this?  If the problem mentioned above
can be avoided, I suppose so?  So if there's an easy way to achieve this, I
guess it can be an option too.

But for the initial support, IMO we are not looking for a perfect yet
complicated solution.  I would say, from a review point of view, it's preferred
to have a simple implementation that achieves a not-perfect, but consistent and
well-defined behaviour.

So to me, killing the VM when the cgroup cannot reclaim any more EPC pages
looks like a simple option.

But I might have missed something, especially since the middle of last week I
have been having a fever and headache :-)

So as mentioned above, you can try other alternatives, but please avoid
complicated ones.

Also, I guess it would be helpful if we could understand the typical SGX app
and/or SGX VM deployment under the EPC cgroup use case.  This may help us
justify why the EPC cgroup algorithm for selecting a victim is reasonable.
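
To make the "simple option" above a bit more concrete, here is a rough toy
model in plain userspace C.  It is only a sketch of the idea, not kernel code:
all names (toy_vm, toy_epc_cgroup, toy_pick_victim, toy_charge_vm) are made up
for illustration, and the victim policy (largest live VM other than the
requester) is purely an assumption.  The point it tries to show is that once a
victim is marked dead it can never take pages back, so two VMs cannot keep
stealing EPC pages from each other.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: none of these names exist in the kernel SGX code. */
struct toy_vm {
	unsigned long epc_pages;	/* virtual EPC pages charged to this VM */
	bool dead;			/* once set, the VM never gets EPC again */
};

struct toy_epc_cgroup {
	unsigned long limit;		/* max EPC pages for the whole cgroup */
	unsigned long reclaimable;	/* pages of normal enclaves, can be reclaimed */
	struct toy_vm *vms;
	size_t nr_vms;
};

unsigned long toy_usage(const struct toy_epc_cgroup *cg)
{
	unsigned long used = cg->reclaimable;

	for (size_t i = 0; i < cg->nr_vms; i++)
		used += cg->vms[i].epc_pages;
	return used;
}

/* Assumed policy: the largest live VM other than the requester. */
struct toy_vm *toy_pick_victim(struct toy_epc_cgroup *cg, struct toy_vm *req)
{
	struct toy_vm *victim = NULL;

	for (size_t i = 0; i < cg->nr_vms; i++) {
		struct toy_vm *vm = &cg->vms[i];

		if (vm == req || vm->dead || vm->epc_pages == 0)
			continue;
		if (!victim || vm->epc_pages > victim->epc_pages)
			victim = vm;
	}
	return victim;
}

/* Charge npages to VM number req; returns false if the request is refused. */
bool toy_charge_vm(struct toy_epc_cgroup *cg, size_t req, unsigned long npages)
{
	struct toy_vm *vm = &cg->vms[req];

	/* A dead VM, or a request that can never fit, is refused up front. */
	if (vm->dead || vm->epc_pages + npages > cg->limit)
		return false;

	while (toy_usage(cg) + npages > cg->limit) {
		unsigned long excess = toy_usage(cg) + npages - cg->limit;

		if (cg->reclaimable > 0) {
			/* Step 1: reclaim from normal enclaves first. */
			unsigned long n = excess < cg->reclaimable ?
					  excess : cg->reclaimable;
			cg->reclaimable -= n;
			continue;
		}

		/* Step 2: nothing left to reclaim, so kill a whole VM. */
		struct toy_vm *victim = toy_pick_victim(cg, vm);

		if (!victim)
			return false;
		victim->epc_pages = 0;
		victim->dead = true;	/* never undone: this is what stops ping-pong */
	}

	vm->epc_pages += npages;
	assert(toy_usage(cg) <= cg->limit);
	return true;
}
```

Each pass through the loop either shrinks the reclaimable pool or permanently
removes one VM, so the loop always terminates.  A "just remove enough pages"
variant would need some other argument for why it cannot bounce pages between
VMs forever.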