On Tue, 18 Jun 2024 06:31:09 -0500, Huang, Kai <kai.huang@xxxxxxxxx>
wrote:
>
> > @@ -921,7 +956,8 @@ static int __init sgx_init(void)
> > if (!sgx_page_cache_init())
> > return -ENOMEM;
> >
> > - if (!sgx_page_reclaimer_init()) {
> > + if (!sgx_page_reclaimer_init() || !sgx_cgroup_init()) {
> > + misc_cg_set_capacity(MISC_CG_RES_SGX_EPC, 0);
> > ret = -ENOMEM;
> > goto err_page_cache;
> > }
>
> This code change is wrong due to two reasons:
>
> 1) If sgx_page_reclaimer_init() was successful, but sgx_cgroup_init()
> failed, you actually need to 'goto err_kthread' because the ksgxd()
> kernel
> thread is already created and is running.
>
> 2) There are other cases after here that can also result in
sgx_init() to
> fail completely, e.g., registering sgx_dev_provision mics device. We
> need
> to reset the capacity to 0 for those cases as well.
>
> AFAICT, you need something like:
>
> diff --git a/arch/x86/kernel/cpu/sgx/main.c
> b/arch/x86/kernel/cpu/sgx/main.c
> index 27892e57c4ef..46f9c26992a7 100644
> --- a/arch/x86/kernel/cpu/sgx/main.c
> +++ b/arch/x86/kernel/cpu/sgx/main.c
> @@ -930,6 +930,10 @@ static int __init sgx_init(void)
> if (ret)
> goto err_kthread;
> + ret = sgx_cgroup_init();
> + if (ret)
> + goto err_provision;
> +
> /*
> * Always try to initialize the native *and* KVM drivers.
> * The KVM driver is less picky than the native one and
> @@ -941,10 +945,12 @@ static int __init sgx_init(void)
> ret = sgx_drv_init();
> if (sgx_vepc_init() && ret)
> - goto err_provision;
> + goto err_cgroup;
> return 0;
> +err_cgroup:
> + /* SGX EPC cgroup cleanup */
> err_provision:
> misc_deregister(&sgx_dev_provision);
> @@ -952,6 +958,8 @@ static int __init sgx_init(void)
> kthread_stop(ksgxd_tsk);
> err_page_cache:
> + misc_misc_cg_set_capacity(MISC_CG_RES_SGX_EPC, 0);
> +
> for (i = 0; i < sgx_nr_epc_sections; i++) {
> vfree(sgx_epc_sections[i].pages);
> memunmap(sgx_epc_sections[i].virt_addr);
>
>
> I put the sgx_cgroup_init() before sgx_drv_init() and sgx_vepc_init(),
> otherwise you will need sgx_drv_cleanup() and sgx_vepc_cleanup()
> respectively when sgx_cgroup_init() fails.
>
Yes, good catch.
> This looks a little bit weird too, though:
>
> Calling misc_misc_cg_set_capacity() to reset capacity to 0 is done at
end
> of sgx_init() error path, because the "set capacity" part is done in
> sgx_epc_cache_init().
> But logically, both "set capacity" and "reset capacity to 0" should be
> SGX
> EPC cgroup operation, so it's more reasonable to do "set capacity" in
> sgx_cgroup_init() and do "reset to 0" in the
>
> /* SGX EPC cgroup cleanup */
>
> as shown above.
>
> Eventually, you will need to do EPC cgroup cleanup anyway, e.g., to
free
> the workqueue, so it's odd to have two places to handle EPC cgroup
> cleanup.
>
> I understand the reason "set capacity" part is done in
> sgx_page_cache_init() now is because in that function you can easily
get
> the capacity. But the fact is @sgx_numa_nodes also tracks EPC size
for
> each node, so you can also get the total EPC size from @sgx_numa_node
in
> sgx_cgroup_init() and set capacity there.
>
> In this case, you can put "reset capacity to 0" and "free workqueue"
> together as the "SGX EPC cgroup cleanup", which is way more clear
IMHO.
>
Okay, will expose @sgx_numa_nodes to epc_cgroup.c and do the
calculations
in sgx_cgroup_init().