On Tue, 2024-04-23 at 19:26 -0500, Haitao Huang wrote: > On Tue, 23 Apr 2024 17:13:15 -0500, Huang, Kai <kai.huang@xxxxxxxxx> wrote: > > > On Tue, 2024-04-23 at 10:30 -0500, Haitao Huang wrote: > > > > > It's a workaround because you use the capacity==0 but it does not > > > really > > > > > mean to disable the misc cgroup for specific resource IIUC. > > > > > > > > Please read the comment around @misc_res_capacity again: > > > > > > > > * Miscellaneous resources capacity for the entire machine. 0 > > > capacity > > > > * means resource is not initialized or not present in the host. > > > > > > > > > > I mentioned this in earlier email. I think this means no SGX EPC. It > > > doesnot mean sgx epc cgroup not enabled. That's also consistent with the > > > behavior try_charge() fails if capacity is zero. > > > > OK. To me the "capacity" is purely the concept of cgroup, so it must be > > interpreted within the scope of "cgroup". If cgroup, in our case, SGX > > cgroup, is disabled, then whether "leaving the capacity to reflect the > > presence of hardware resource" doesn't really matter. > > So what you are saying is that, the kernel must initialize the capacity > > of > > some MISC resource once it is added as valid type. > > And you must initialize the "capacity" even MISC cgroup is disabled > > entirely by kernel commandline, in which case, IIUC, misc.capacity is not > > even going to show in the /fs. > > > > If this is your point, then your patch: > > > > cgroup/misc: Add SGX EPC resource type > > > > is already broken, because you added the new type w/o initializing the > > capacity. > > > > Please fix that up. > > > > > > > > > > > > > > > There is explicit way for user to disable misc without setting > > > capacity> > to > > > > > zero. > > > > > > > > Which way are you talking about? > > > > > > Echo "-misc" to cgroup.subtree_control at root level for example still > > > shows non-zero sgx_epc capacity. > > > > I guess "having to disable all MISC resources just in order to disable > > SGX > > EPC cgroup" is a brilliant idea. > > > > You can easily disable the entire MISC cgroup by commandline for that > > purpose if that's acceptable. > > > > And I have no idea why "still showing non-zero EPC capacity" is important > > if SGX cgroup cannot be supported at all. > > > > Okay, all I'm trying to say is we should care about consistency in code > and don't want SGX do something different. Mixing "disable" with > "capacity==0" causes inconsistencies AFAICS: > > 1) The try_charge() API currently returns error when capacity is zero. So > it appears not to mean that the cgroup is disabled otherwise it should > return success. I agree this isn't ideal. My view is we can fix it in MISC code if needed. > > 2) The current explicit way ("-misc") to disable misc still shows non-zero > entries in misc.capacity. (At least for v2 cgroup, it does when I tested). > Maybe this is not important but I just don't feel good about this > inconsistency. This belongs to "MISC resource cgroup was initially enabled by the kernel at boot time, but later was disabled *somewhere in hierarchy* by the user". In fact, if you only do "-misc" for "some subtree", it's quite reasonable to still report the resource in max.capacity. In the case above, the "some subtree" happens to be the root. So to me it's reasonable to still show max.capacity in this case. And you can actually argue that the kernel still supports the cgroup for the resource. E.g., you can at runtime do "+misc" to re-enable. However, if the kernel isn't able to support certain MISC resource cgroup at boot time, it's quite reasonable to just set the "capacity" to 0 so it isn't visible to userspace. Note: My key point is, when userspace sees 0 "capacity", it shouldn't need to care about whether it is because of "hardware resource is not available", or "hardware resource is available but kernel cannot support cgroup for it". The resource cgroup is simply unavailable. That means the kernel has full right to just hide that resource from the cgroup at boot time. But this should be just within "cgroup's scope", i.e., that resource can still be available if kernel can provide it. If some app wants to additionally check whether such resource is indeed available but only cgroup is not available, it should check resource specific interface but not take advantage of the MISC cgroup interface. > > For now I'll just do BUG_ON() unless there are more strong opinions one > way or the other. > Fine to me.