On Tue, 2024-04-23 at 08:08 -0500, Haitao Huang wrote:
On Mon, 22 Apr 2024 17:16:34 -0500, Huang, Kai <kai.huang@xxxxxxxxx>
wrote:
> On Mon, 2024-04-22 at 11:17 -0500, Haitao Huang wrote:
> > On Sun, 21 Apr 2024 19:22:27 -0500, Huang, Kai <kai.huang@xxxxxxxxx>
> > wrote:
> >
> > > On Fri, 2024-04-19 at 20:14 -0500, Haitao Huang wrote:
> > > > > > > I think we can add support for "sgx_cgroup=disabled" in
> > > > > > > future if indeed needed. But just for init failure, no?
> > > > > >
> > > > >
> > > > > > It's not about the command line, which we can add in the
> > > > > > future when needed. It's that we need a way to handle SGX
> > > > > > cgroup being disabled at boot time nicely, because we
> > > > > > already have a case where we need to do so.
> > > > > >
> > > > > > Your approach looks half-way to me, and is not future
> > > > > > extensible. If we choose to do it, do it right -- that is,
> > > > > > we need a way to disable it completely in both kernel and
> > > > > > userspace so that userspace won't be able to see it.
> > > >
> > > > > That would need more changes in misc cgroup implementation to
> > > > > support sgx-disable. Right now misc does not have separate
> > > > > files for different resource types. So we can only block
> > > > > echo "sgx_epc..." to those interface files, can't really make
> > > > > files not visible.
> > >
> > > > By "won't be able to see" I mean "only for SGX EPC resource",
> > > > not the control files for the entire MISC cgroup.
> > >
> > > I replied at the beginning of the previous reply:
> > >
> > > "
> > > > Given SGX EPC is just one type of MISC cgroup resource, we
> > > > cannot just disable MISC cgroup as a whole.
> > > "
> > >
> > > Sorry I missed this point. See below.
> >
> > > > You just need to set the SGX EPC "capacity" to 0 to disable SGX
> > > > EPC. See the comment of @misc_res_capacity:
> > > >
> > > > * Miscellaneous resources capacity for the entire machine. 0
> > > > * capacity means resource is not initialized or not present in
> > > > * the host.
> > >
> >
> > > IIUC I don't think the situation we have is either of those
> > > cases. For our case, the resource is inited and present on the
> > > host but we have an allocation error for sgx cgroup infra.
>
> You have calculated the "capacity", but later you failed something and
> then reset the "capacity" to 0, i.e., cleanup. What's wrong with that?
>
> >
> > > And "blocking echo sgx_epc ... to those control files" is already
> > > sufficient for the purpose of not exposing SGX EPC to userspace,
> > > > correct?
> > >
> > > > E.g., if SGX cgroup is enabled, you can see the below when you
> > > > read "max":
> > >
> > > # cat /sys/fs/cgroup/my_group/misc.max
> > > # <resource1> <max1>
> > > sgx_epc ...
> > > ...
> > >
> > > Otherwise you won't be able to see "sgx_epc":
> > >
> > > # cat /sys/fs/cgroup/my_group/misc.max
> > > # <resource1> <max1>
> > > ...
> > >
> > > > And when you try to write the "max" for "sgx_epc", you will hit
> > > > an error:
> > >
> > > # echo "sgx_epc 100" > /sys/fs/cgroup/my_group/misc.max
> > > # ... echo: write error: Invalid argument
> > >
> > > > The above applies to all the control files. To me this pretty
> > > > much means "SGX EPC is disabled" or "not supported" for
> > > > userspace.
> > >
> > > You are right, capacity == 0 does block echoing max and users see
> > > an error if they do that. But 1) I doubt you literally wanted "SGX
> > > EPC is disabled" and to make it unsupported in this case,
>
> I don't understand. Something failed during SGX cgroup initialization,
> you _literally_ cannot continue to support it.
>
>
Then we should just return -ENOMEM from sgx_init() when sgx cgroup
initialization fails?
I thought we only disable SGX cgroup support. SGX can still run.

I am not sure how you got this conclusion. I specifically said something
failed during SGX "cgroup" initialization, so only SGX "cgroup" needs to
be disabled, not SGX as a whole.
> > 2) even if we accept this is "sgx cgroup disabled" I don't see how
> > it is a much better user experience than the current solution or
> > really helps the user better.
>
> In your way, the userspace is still able to see "sgx_epc" in control
> files and is able to update them. So from userspace's perspective SGX
> cgroup is enabled, but obviously updating to "max" doesn't have any
> impact. This will confuse userspace.
>
> >
Setting capacity to zero also confuses user space. Some applications may
rely on this file to know the capacity.

Why??
Are you saying before this SGX cgroup patchset those applications cannot
run?
> > Also to implement this approach, as you mentioned, we need to work
> > around the fact that misc_try_charge() fails when capacity is set to
> > zero, and add code to return root always?
>
> Why is this a problem?
>
It changes/overrides the original meaning of capacity==0: no one can
allocate if capacity is zero.

Why??
Are you saying before this series, no one can allocate EPC page?
> > So it seems like more workaround code to just make it work for a
> > failing case no one really cares much about, and the end result is
> > not really much better IMHO.
>
> It's not workaround, it's the right thing to do.
>
> The result is userspace will see it being disabled when the kernel
> disables it.
>
>
It's a workaround because you use the capacity==0 but it does not really
mean to disable the misc cgroup for specific resource IIUC.

Please read the comment around @misc_res_capacity again:
* Miscellaneous resources capacity for the entire machine. 0 capacity
* means resource is not initialized or not present in the host.
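
To make the capacity == 0 semantics discussed above concrete, here is a
minimal userspace sketch. It is only a simplified model of the behavior
described in this thread (a zero-capacity resource is skipped when "max"
is read, and a write for it fails with "Invalid argument"); it is not the
kernel implementation, and every model_* name is hypothetical.

```c
/*
 * Simplified userspace model of how capacity == 0 hides one misc cgroup
 * resource. All model_* names are hypothetical; this is not kernel code.
 */
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

enum model_res_type { MODEL_RES_A, MODEL_RES_SGX_EPC, MODEL_RES_TYPES };

static const char *const model_res_name[MODEL_RES_TYPES] = {
	"res_a", "sgx_epc",
};

/* Machine-wide capacity; 0 means "not initialized or not present". */
static unsigned long model_capacity[MODEL_RES_TYPES];
/* Limit configured for a single cgroup. */
static unsigned long model_max[MODEL_RES_TYPES];

/* Model of reading "max": resources with zero capacity are skipped. */
static int model_max_show(char *buf, size_t len)
{
	int n = 0;

	assert(len > 0);
	buf[0] = '\0';
	for (int i = 0; i < MODEL_RES_TYPES; i++) {
		if (!model_capacity[i])
			continue;	/* hidden from userspace */
		if ((size_t)n >= len)
			break;		/* output buffer is full */
		n += snprintf(buf + n, len - n, "%s %lu\n",
			      model_res_name[i], model_max[i]);
	}
	return n;
}

/* Model of writing "<name> <max>" to "max". */
static int model_max_write(const char *name, unsigned long max)
{
	for (int i = 0; i < MODEL_RES_TYPES; i++) {
		if (strcmp(name, model_res_name[i]))
			continue;
		if (!model_capacity[i])
			return -EINVAL;	/* resource is treated as absent */
		model_max[i] = max;
		return 0;
	}
	return -EINVAL;	/* unknown resource name */
}
```

In this model, with model_capacity[MODEL_RES_SGX_EPC] left at 0,
model_max_write("sgx_epc", 100) returns -EINVAL and model_max_show()
omits the "sgx_epc" line, while other resources keep working. That
mirrors the misc.max examples quoted earlier in the thread.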