Re: [PATCH v15 05/14] x86/sgx: Implement basic EPC misc cgroup functionality

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, 18 Jun 2024 06:31:09 -0500, Huang, Kai <kai.huang@xxxxxxxxx> wrote:


@@ -921,7 +956,8 @@ static int __init sgx_init(void)
 	if (!sgx_page_cache_init())
 		return -ENOMEM;

-	if (!sgx_page_reclaimer_init()) {
+	if (!sgx_page_reclaimer_init() || !sgx_cgroup_init()) {
+		misc_cg_set_capacity(MISC_CG_RES_SGX_EPC, 0);
 		ret = -ENOMEM;
 		goto err_page_cache;
 	}

This code change is wrong due to two reasons:

1) If sgx_page_reclaimer_init() was successful, but sgx_cgroup_init()
failed, you actually need to 'goto err_kthread' because the ksgxd() kernel
thread is already created and is running.

2) There are other cases after here that can also result in sgx_init() to
fail completely, e.g., registering sgx_dev_provision mics device. We need
to reset the capacity to 0 for those cases as well.

AFAICT, you need something like:

diff --git a/arch/x86/kernel/cpu/sgx/main.c
b/arch/x86/kernel/cpu/sgx/main.c
index 27892e57c4ef..46f9c26992a7 100644
--- a/arch/x86/kernel/cpu/sgx/main.c
+++ b/arch/x86/kernel/cpu/sgx/main.c
@@ -930,6 +930,10 @@ static int __init sgx_init(void)
        if (ret)
                goto err_kthread;
+       ret = sgx_cgroup_init();
+       if (ret)
+               goto err_provision;
+
        /*
         * Always try to initialize the native *and* KVM drivers.
         * The KVM driver is less picky than the native one and
@@ -941,10 +945,12 @@ static int __init sgx_init(void)
        ret = sgx_drv_init();
       if (sgx_vepc_init() && ret)
-               goto err_provision;
+               goto err_cgroup;
       return 0;
+err_cgroup:
+       /* SGX EPC cgroup cleanup */
 err_provision:
        misc_deregister(&sgx_dev_provision);
@@ -952,6 +958,8 @@ static int __init sgx_init(void)
        kthread_stop(ksgxd_tsk);
err_page_cache:
+       misc_misc_cg_set_capacity(MISC_CG_RES_SGX_EPC, 0);
+
        for (i = 0; i < sgx_nr_epc_sections; i++) {
                vfree(sgx_epc_sections[i].pages);
                memunmap(sgx_epc_sections[i].virt_addr);


I put the sgx_cgroup_init() before sgx_drv_init() and sgx_vepc_init(),
otherwise you will need sgx_drv_cleanup() and sgx_vepc_cleanup()
respectively when sgx_cgroup_init() fails.


Yes, good catch.

This looks a little bit weird too, though:

Calling misc_misc_cg_set_capacity() to reset capacity to 0 is done at end
of sgx_init() error path, because the "set capacity" part is done in
sgx_epc_cache_init(). But logically, both "set capacity" and "reset capacity to 0" should be SGX
EPC cgroup operation, so it's more reasonable to do "set capacity" in
sgx_cgroup_init() and do "reset to 0" in the

	/* SGX EPC cgroup cleanup */

as shown above.

Eventually, you will need to do EPC cgroup cleanup anyway, e.g., to free
the workqueue, so it's odd to have two places to handle EPC cgroup
cleanup.

I understand the reason "set capacity" part is done in
sgx_page_cache_init() now is because in that function you can easily get
the capacity.  But the fact is @sgx_numa_nodes also tracks EPC size for
each node, so you can also get the total EPC size from @sgx_numa_node in
sgx_cgroup_init() and set capacity there.

In this case, you can put "reset capacity to 0" and "free workqueue"
together as the "SGX EPC cgroup cleanup", which is way more clear IMHO.

Okay, will expose @sgx_numa_nodes to epc_cgroup.c and do the calculations in sgx_cgroup_init().
Thanks
Haitao




[Index of Archives]     [AMD Graphics]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux