Hi Boris, thanks for taking look. On Mon, 2021-12-20 at 20:30 +0100, Borislav Petkov wrote: > On Mon, Dec 20, 2021 at 09:46:39AM -0800, Kristen Carlson Accardi > wrote: > > Similar to the core kswapd, ksgxd, is responsible for managing the > > overcommitment of enclave memory. If the system runs out of > > enclave memory, > > -*ksgxd* “swaps” enclave memory to normal memory. > > +*ksgxd* “swaps” enclave memory to normal RAM. This normal RAM is > > allocated > > +via per enclave shared memory. The shared memory area is not > > mapped into the > > +enclave or the task mapping it, which makes its memory use opaque > > - including > > +to the system out of memory killer (OOM). This can be problematic > > when there > > +are no limits in place on the amount an enclave can allocate. > > Problematic how? If a malicious or just extra large enclave is loaded, or even just a lot of enclaves, it can eat up all the normal RAM on the system. Normal methods of finding out where all the memory on the system is being used, wouldn't be able to find this usage since it is shared memory. In addition, the OOM killer wouldn't be able to kill any enclaves. > > The commit message above is talking about what your patch does and > that > is kinda clear from the diff. The *why* is really missing. Only that > allusion that it might be problematic in some cases but that's not > even > scratching the surface. > > > +At boot time, the module parameter "sgx.overcommit_percent" can be > > used to > > +place a limit on the number of shared memory backing pages that > > may be > > +allocated, expressed as a percentage of the total number of EPC > > pages in the > > +system. A value of 100 is the default, and represents a limit > > equal to the > > +number of EPC pages in the system. To disable the limit, set > > +sgx.overcommit_percent to -1. The number of backing pages > > available to > > +enclaves is a global resource. If the system exceeds the number of > > allowed > > +backing pages in use, the reclaimer will be unable to swap EPC > > pages to > > +shared memory. > > So you're basically putting the burden on the user/sysadmin to > *actually* *know* what percentage is "problematic" and to know what > to > supply. I'd bet not very many people would know how much is > problematic > and it probably all depends. > > So why don't you come up with a sane default, instead, which works in > most cases and set it automatically? The default value is set to 100%, and this percentage, plus the number of EPC pages in the system is used to calculate a sane value for the number of backing pages to add - in this case, exactly the number of EPC pages in the system. It is set automatically if the parameter is not used to override it. > > Dunno, maybe some scaled percentage of memory depending on how many > enclaves are run but all up to a sane limit of, say, 90% of total > memory > so that there are 10% left for normal system operation. I think in this case you are still making a judgement call for the admin about how much memory you think they ought to be able to use for non-SGX operations, and it feels more natural to me to have the default based on how many backing pages you might reasonably need. I suppose one could check the total system memory available and double check that the calculated number of backing pages you'd give out would never exceed some percentage of total system RAM, but then you wind up with 2 knobs to play with instead of just one, which seems wrong to me. > > This way you'll avoid "problematic" and still have some memory left > for > other use. > > Or something like that. > > Adding yet another knob is yuck and the easy way out. And we have > waaaay > too many knobs so we should always try to do the automatic thing, if > at > all possible. I completely agree - so I'm trying to make sure I understand this comment, as the value is currently set to default that would automatically apply that is based on EPC memory present and not a fixed value. So I'd like to understand what you'd like to see done differently. are you saying you'd like to eliminated the ability to override the automatic default? Or just that you'd rather calculate the percentage based on total normal system RAM rather than EPC memory? Thanks, Kristen