Re: [PATCH v9 08/15] x86/sgx: Implement EPC reclamation flows for cgroup

On Wed, 21 Feb 2024 04:48:58 -0600, Huang, Kai <kai.huang@xxxxxxxxx> wrote:

On Wed, 2024-02-21 at 00:23 -0600, Haitao Huang wrote:
Hi Kai
On Tue, 20 Feb 2024 03:52:39 -0600, Huang, Kai <kai.huang@xxxxxxxxx> wrote:
[...]
>
> So you introduced the work/workqueue here but there's no place which actually
> queues the work.  IMHO you can either:
>
> 1) move relevant code change here; or
> 2) focus on introducing core functions to reclaim certain pages from a given
> EPC cgroup w/o workqueue and introduce the work/workqueue in later patch.
>
> Makes sense?
>

Starting in v7, I was trying to split the big patch (#10 in v6), as you and
others suggested. My thought process was to put the infrastructure needed for
per-cgroup reclaim at the front, then turn on per-cgroup reclaim in [v9
13/15] at the end.

That's reasonable for sure.


Thanks for the confirmation :-)


Before that, all reclaimable pages are tracked in the global LRU, so really
there is no "reclaim certain pages from a given EPC cgroup w/o workqueue"
or reclaim through workqueue before that point, as suggested in #2. This
patch puts down the implementation for both flows, but neither is used yet,
as stated in the commit message.

I know it's not used yet. The point is how to split patches to make them more
self-contained and easier to review.

I would think this patch is already self-contained, in that everything in it implements cgroup reclamation building blocks that are used later. But I'll try to follow your suggestions below to split further (I would prefer not to merge in general unless there are strong reasons).


For #2, sorry for not being explicit -- I meant it seems it's more reasonable to
split in this way:

Patch 1)
  a). change to sgx_reclaim_pages();

I'd still prefer this to be a separate patch. It is self-contained IMHO.
We were splitting the original patch because it was too big. I don't want to merge it back unless there is a strong reason.

  b). introduce sgx_epc_cgroup_reclaim_pages();

Ok.

  c). introduce sgx_epc_cgroup_reclaim_work_func() (use a better name), which just takes an EPC cgroup as input w/o involving any work/workqueue.

This is for workqueue use only. So I think it would be better placed with patch #2 below?


These functions are all related to how to implement reclaiming pages from a given EPC cgroup, and they are logically related in terms of implementation, so
they are easier to review together.


This is pretty much the current patch + sgx_reclaim_pages() - workqueue.

Then you just need to justify the design/implementation in changelog/comments.


How about just doing b) in patch #1, and stating that the new function is the building block and will be used for both direct and indirect reclamation?

Patch 2)
  - Introduce work/workqueue, and implement the logic to queue the work.

Now we all know there's a function to reclaim pages for a given EPC cgroup, then
we can focus on when that is called, either directly or indirectly.
	

The try_charge() function will actually do both.
For indirect reclaim, it queues the work to the wq. For direct reclaim, it just calls sgx_epc_cgroup_reclaim_pages(). That part is already in a separate (I think self-contained) patch [v9, 10/15].
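
To make the two flows concrete, here is a minimal userspace sketch of how one charge function could drive both of them through a single reclaim building block. It is not the code from the series: the struct, the model_* names, the batch size, and the "near the limit" test are all assumptions made up for illustration, and a boolean stands in for queueing a work item on a workqueue.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; none of these types or names are the kernel's. */
struct epc_cgroup_model {
	size_t usage;        /* pages currently charged to the cgroup */
	size_t limit;        /* hard limit, in pages */
	size_t reclaimable;  /* pages a reclaim pass could free (<= usage) */
	bool work_queued;    /* stands in for a work item pending on a workqueue */
};

#define MODEL_RECLAIM_BATCH 16	/* assumed batch size, for illustration only */

/* Shared building block: free up to one batch, return the number freed. */
static size_t model_reclaim_pages(struct epc_cgroup_model *cg)
{
	size_t n = cg->reclaimable < MODEL_RECLAIM_BATCH ?
		   cg->reclaimable : MODEL_RECLAIM_BATCH;

	assert(cg->reclaimable <= cg->usage);
	cg->reclaimable -= n;
	cg->usage -= n;
	return n;
}

/* Indirect flow: what a work function would do once the workqueue runs it. */
static void model_reclaim_work_func(struct epc_cgroup_model *cg)
{
	cg->work_queued = false;
	model_reclaim_pages(cg);
}

/* Charge one page. Returns 0 on success, -1 if no room can be made. */
static int model_try_charge(struct epc_cgroup_model *cg)
{
	/* Direct flow: at the limit, reclaim synchronously in the caller. */
	while (cg->usage >= cg->limit) {
		if (model_reclaim_pages(cg) == 0)
			return -1;
	}

	/* The new page is not counted as reclaimable in this simple model. */
	cg->usage++;

	/* Indirect flow: near the limit, defer further reclaim to the worker. */
	if (cg->limit - cg->usage < MODEL_RECLAIM_BATCH)
		cg->work_queued = true;

	return 0;
}
```

In this model, the direct path and the work function both end up in model_reclaim_pages(), so that function can be reviewed on its own before the question of when it gets called.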

So for this patch, I'll add sgx_epc_cgroup_reclaim_work_func() and introduce work/workqueue so later work can be queued?


#1 would force me to go back and merge the patches again.

I don't think so. I am not asking to put all things together, but only asking
to split in a better way (as I see it).


Okay.

You mentioned some function is "Scheduled by sgx_epc_cgroup_try_charge() to reclaim pages", but I am not seeing any code doing that in this patch. This needs fixing, either by moving the relevant code here, or by removing these not-done-yet comments.


Yes. The comments will be fixed.

For instance (I am just giving an example), if after review we found the
queue_work() shouldn't be done in try_charge(), you will need to go back to this
patch and remove these comments.

That's not the best way.  Each patch needs to be self-contained.


Sorry, I feel kind of lost on this whole thing by now. It seems so random
to me. Are there hard rules on this?

See above. I am only offering my opinion on how to split patches in a better way.


To be honest, the part I find most confusing is this self-contained-ness. It seems to depend on how you look at things.


I was hoping these statements would help reviewers on the flow of the
patches.

At the end of [v9 04/15]:

For now, the EPC cgroup simply blocks additional EPC allocation in
sgx_alloc_epc_page() when the limit is reached. Reclaimable pages are
still tracked in the global active list, only reclaimed by the global
reclaimer when the total free page count is lower than a threshold.

Later patches will reorganize the tracking and reclamation code in the
global reclaimer and implement per-cgroup tracking and reclaiming.

At the end of [v9 06/15]:

Next patches will first get the cgroup reclamation flow ready while
keeping pages tracked in the global LRU and reclaimed by ksgxd before we
make the switch in the end for sgx_lru_list() to return per-cgroup
LRU.

At the end of [v9 08/15]:

Both synchronous and asynchronous flows invoke the same top level reclaim
function, and will be triggered later by sgx_epc_cgroup_try_charge()
when usage of the cgroup is at or near its limit.

At the end of [v9 10/15]:

Note at this point, all reclaimable EPC pages are still tracked in the
global LRU and per-cgroup LRUs are empty. So no per-cgroup reclamation
is activated yet.

They are useful in the changelog in each patch I suppose, but to me we are
discussing different things.

One pain point I found in the review is that I have to jump back and forth many times among multiple patches to see whether one patch is reasonable. That's why I am asking whether there's a better way to split patches so that each patch can be logically self-contained in some way and easier to review.


I very much appreciate your time and effort in providing a detailed review. You have been very helpful. If you think it makes sense, I'll split this patch into two with the modifications stated above.

Thanks
Haitao



