Re: [PATCH v9 10/15] x86/sgx: Add EPC reclamation in cgroup try_charge()

On Tue, 13 Feb 2024 19:52:25 -0600, Jarkko Sakkinen <jarkko@xxxxxxxxxx> wrote:

On Tue Feb 13, 2024 at 1:15 AM EET, Haitao Huang wrote:
Hi Jarkko

On Mon, 12 Feb 2024 13:55:46 -0600, Jarkko Sakkinen <jarkko@xxxxxxxxxx>
wrote:

> On Mon Feb 5, 2024 at 11:06 PM EET, Haitao Huang wrote:
>> From: Kristen Carlson Accardi <kristen@xxxxxxxxxxxxxxx>
>>
>> When the EPC usage of a cgroup is near its limit, the cgroup needs to
>> reclaim pages used in the same cgroup to make room for new allocations.
>> This is analogous to the behavior that the global reclaimer is triggered
>> when the global usage is close to total available EPC.
>>
>> Add a Boolean parameter for sgx_epc_cgroup_try_charge() to indicate
>> whether synchronous reclaim is allowed or not. And trigger the
>> synchronous/asynchronous reclamation flow accordingly.
>>
>> Note at this point, all reclaimable EPC pages are still tracked in the
>> global LRU and per-cgroup LRUs are empty. So no per-cgroup reclamation
>> is activated yet.
>>
>> Co-developed-by: Sean Christopherson <sean.j.christopherson@xxxxxxxxx>
>> Signed-off-by: Sean Christopherson <sean.j.christopherson@xxxxxxxxx>
>> Signed-off-by: Kristen Carlson Accardi <kristen@xxxxxxxxxxxxxxx>
>> Co-developed-by: Haitao Huang <haitao.huang@xxxxxxxxxxxxxxx>
>> Signed-off-by: Haitao Huang <haitao.huang@xxxxxxxxxxxxxxx>
>> ---
>> V7:
>> - Split this out from the big patch, #10 in V6. (Dave, Kai)
>> ---
>>  arch/x86/kernel/cpu/sgx/epc_cgroup.c | 26 ++++++++++++++++++++++++--
>>  arch/x86/kernel/cpu/sgx/epc_cgroup.h |  4 ++--
>>  arch/x86/kernel/cpu/sgx/main.c       |  2 +-
>>  3 files changed, 27 insertions(+), 5 deletions(-)
>>
>> diff --git a/arch/x86/kernel/cpu/sgx/epc_cgroup.c b/arch/x86/kernel/cpu/sgx/epc_cgroup.c
>> index d399fda2b55e..abf74fdb12b4 100644
>> --- a/arch/x86/kernel/cpu/sgx/epc_cgroup.c
>> +++ b/arch/x86/kernel/cpu/sgx/epc_cgroup.c
>> @@ -184,13 +184,35 @@ static void sgx_epc_cgroup_reclaim_work_func(struct work_struct *work)
>>  /**
>>   * sgx_epc_cgroup_try_charge() - try to charge cgroup for a single EPC page
>>   * @epc_cg:	The EPC cgroup to be charged for the page.
>> + * @reclaim:	Whether or not synchronous reclaim is allowed
>>   * Return:
>>   * * %0 - If successfully charged.
>>   * * -errno - for failures.
>>   */
>> -int sgx_epc_cgroup_try_charge(struct sgx_epc_cgroup *epc_cg)
>> +int sgx_epc_cgroup_try_charge(struct sgx_epc_cgroup *epc_cg, bool reclaim)
>>  {
>> -	return misc_cg_try_charge(MISC_CG_RES_SGX_EPC, epc_cg->cg, PAGE_SIZE);
>> +	for (;;) {
>> +		if (!misc_cg_try_charge(MISC_CG_RES_SGX_EPC, epc_cg->cg,
>> +					PAGE_SIZE))
>> +			break;
>> +
>> +		if (sgx_epc_cgroup_lru_empty(epc_cg->cg))
>> +			return -ENOMEM;
>> +
>> +		if (signal_pending(current))
>> +			return -ERESTARTSYS;
>> +
>> +		if (!reclaim) {
>> +			queue_work(sgx_epc_cg_wq, &epc_cg->reclaim_work);
>> +			return -EBUSY;
>> +		}
>> +
>> +		if (!sgx_epc_cgroup_reclaim_pages(epc_cg->cg, false))
>> +			/* All pages were too young to reclaim, try again a little later */
>> +			schedule();
>
> This will be total pain to backtrack after a while when something
> needs to be changed so there definitely should be inline comments
> addressing each branch condition.
>
> I'd rethink this as:
>
> 1. Create static __sgx_epc_cgroup_try_charge() for addressing single
>    iteration with the new "reclaim" parameter.
> 2. Add a new sgx_epc_cgroup_try_charge_reclaim() function.
>
> There's a bit of redundancy with sgx_epc_cgroup_try_charge() and
> sgx_epc_cgroup_try_charge_reclaim() because both have almost the
> same loop calling internal __sgx_epc_cgroup_try_charge() with
> different parameters. That is totally acceptable.
>
> Please also add my suggested-by.
>
> BR, Jarkko
>
For #2:
The only caller of this function, sgx_alloc_epc_page(), has the same
boolean, which is passed into this function.

I know. This would be a good opportunity to fix that up. A large patch
set should leave the area around its feature in the best possible shape,
and so should also clean up the code base overall.

If we separate it into sgx_epc_cgroup_try_charge() and
sgx_epc_cgroup_try_charge_reclaim(), then the caller has to have the
if/else branches. So the separation does not seem to help here?

Of course it does. It makes the code in that location self-documenting
and easier to remember what it does.

BR, Jarkko
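
To illustrate the point about the call site, here is a minimal user-space sketch, not kernel code. The stub_* functions and struct epc_cg_stub are hypothetical stand-ins that only record which variant ran. It shows the if/else a caller would carry for as long as it still receives the boolean.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct sgx_epc_cgroup; counts which path ran. */
struct epc_cg_stub {
	int async_calls;	/* times the non-blocking variant was used */
	int sync_calls;		/* times the reclaiming variant was used */
};

/* Stub: the real function would charge and, on -EBUSY, queue async reclaim. */
static int stub_try_charge(struct epc_cg_stub *cg)
{
	cg->async_calls++;
	return 0;
}

/* Stub: the real function would reclaim directly and retry until charged. */
static int stub_try_charge_reclaim(struct epc_cg_stub *cg)
{
	cg->sync_calls++;
	return 0;
}

/* Caller-side dispatch: each branch names the behavior at the call site. */
static int stub_alloc_charge(struct epc_cg_stub *cg, bool reclaim)
{
	assert(cg);

	if (reclaim)
		return stub_try_charge_reclaim(cg);

	return stub_try_charge(cg);
}
```

With the split, the function name at each branch says whether the caller may block on reclaim, so a reader does not have to trace a boolean into the callee.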


Please let me know if this aligns with your suggestion.


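static int ___sgx_epc_cgroup_try_charge(struct sgx_epc_cgroup *epc_cg)
{
        /* Charged within the limit: done. */
        if (!misc_cg_try_charge(MISC_CG_RES_SGX_EPC, epc_cg->cg,
                                        PAGE_SIZE))
                return 0;

        /* Charge failed and this cgroup has nothing to reclaim. */
        if (sgx_epc_cgroup_lru_empty(epc_cg->cg))
                return -ENOMEM;

        /* Do not keep a task with a pending signal waiting on reclaim. */
        if (signal_pending(current))
                return -ERESTARTSYS;

        /* Reclaimable pages exist; the caller decides how to reclaim. */
        return -EBUSY;
}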

/**
 * sgx_epc_cgroup_try_charge() - try to charge cgroup for a single page
 * @epc_cg:     The EPC cgroup to be charged for the page.
 *
 * Try to reclaim pages in the background if the group reaches its limit and
 * there are reclaimable pages in the group.
 * Return:
 * * %0 - If successfully charged.
 * * -errno - for failures.
 */
int sgx_epc_cgroup_try_charge(struct sgx_epc_cgroup *epc_cg)
{
        int ret = ___sgx_epc_cgroup_try_charge(epc_cg);

        if (ret == -EBUSY)
                queue_work(sgx_epc_cg_wq, &epc_cg->reclaim_work);

        return ret;
}

/**
 * sgx_epc_cgroup_try_charge_reclaim() - try to charge cgroup for a single page
 * @epc_cg:     The EPC cgroup to be charged for the page.
 *
 * Try to reclaim pages directly if the group reaches its limit and there are
 * reclaimable pages in the group.
 * Return:
 * * %0 - If successfully charged.
 * * -errno - for failures.
 */
int sgx_epc_cgroup_try_charge_reclaim(struct sgx_epc_cgroup *epc_cg)
{
        int ret;

        for (;;) {
                ret = ___sgx_epc_cgroup_try_charge(epc_cg);
                if (ret != -EBUSY)
                        return ret;

                if (!sgx_epc_cgroup_reclaim_pages(epc_cg->cg, current->mm))
                        /* All pages were too young to reclaim, try again a little later */
                        schedule();
        }

        return 0;
}
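
For readers who want to step through the control flow outside the kernel, the following is a rough user-space model of the three functions above, not the patch itself. Everything prefixed model_ or MODEL_ and every field of struct epc_cg_model is an assumption made for illustration: the misc cgroup charge is reduced to a counter compared against a limit, the per-cgroup LRU to a count of reclaimable pages, queue_work() to a counter, and reclaim is assumed to free exactly one page per call.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Kernel-internal errno value; user-space <errno.h> does not define it. */
#define MODEL_ERESTARTSYS 512

/* Hypothetical model of one EPC cgroup; all fields are assumptions. */
struct epc_cg_model {
	long usage;		/* pages currently charged */
	long limit;		/* maximum pages allowed */
	long reclaimable;	/* pages on the modeled per-cgroup LRU */
	bool signal_pending;	/* models signal_pending(current) */
	int queued_work;	/* models the number of queue_work() calls */
};

/* One charge attempt; mirrors the branch order of the proposed helper. */
static int model_try_charge_once(struct epc_cg_model *cg)
{
	assert(cg->usage >= 0 && cg->limit >= 0 && cg->reclaimable >= 0);

	/* Room under the limit: charge one page and succeed. */
	if (cg->usage < cg->limit) {
		cg->usage++;
		return 0;
	}

	/* At the limit with nothing to reclaim: hard failure. */
	if (cg->reclaimable == 0)
		return -ENOMEM;

	/* A pending signal interrupts the attempt. */
	if (cg->signal_pending)
		return -MODEL_ERESTARTSYS;

	/* Reclaimable pages exist; the caller decides how to reclaim. */
	return -EBUSY;
}

/* Models reclaim: frees at most one page and returns the number freed. */
static long model_reclaim_pages(struct epc_cg_model *cg)
{
	if (cg->reclaimable == 0)
		return 0;

	cg->reclaimable--;
	cg->usage--;
	return 1;
}

/* Non-blocking variant: on -EBUSY, hand reclaim to a background worker. */
static int model_try_charge(struct epc_cg_model *cg)
{
	int ret = model_try_charge_once(cg);

	if (ret == -EBUSY)
		cg->queued_work++;

	return ret;
}

/* Blocking variant: reclaim directly and retry until a definite result. */
static int model_try_charge_reclaim(struct epc_cg_model *cg)
{
	for (;;) {
		int ret = model_try_charge_once(cg);

		if (ret != -EBUSY)
			return ret;

		model_reclaim_pages(cg);
	}
}
```

In this model the blocking loop always terminates, because every -EBUSY iteration consumes one reclaimable page. The real code has no such guarantee per iteration, which is why it calls schedule() when no page could be reclaimed.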

It is a little more involved to remove the boolean for sgx_alloc_epc_page() and its callers like sgx_encl_grow(), sgx_alloc_va_page(). I'll send a separate patch for comments.

Thanks
Haitao