On Wed, 21 Feb 2024 04:48:58 -0600, Huang, Kai <kai.huang@xxxxxxxxx> wrote:
On Wed, 2024-02-21 at 00:23 -0600, Haitao Huang wrote:
StartHi Kai
On Tue, 20 Feb 2024 03:52:39 -0600, Huang, Kai <kai.huang@xxxxxxxxx>
wrote:
[...]
>
> So you introduced the work/workqueue here but there's no place which
> actually
> queues the work. IMHO you can either:
>
> 1) move relevant code change here; or
> 2) focus on introducing core functions to reclaim certain pages from a
> given EPC
> cgroup w/o workqueue and introduce the work/workqueue in later patch.
>
> Makes sense?
>
Starting in v7, I was trying to split the big patch, #10 in v6 as you
and
others suggested. My thought process was to put infrastructure needed
for
per-cgroup reclaim in the front, then turn on per-cgroup reclaim in [v9
13/15] in the end.
That's reasonable for sure.
Thanks for the confirmation :-)
Before that, all reclaimables are tracked in the global LRU so really
there is no "reclaim certain pages from a given EPC cgroup w/o
workqueue"
or reclaim through workqueue before that point, as suggested in #2. This
patch puts down the implementation for both flows but neither used yet,
as
stated in the commit message.
I know it's not used yet. The point is how to split patches to make
them more
self-contain and easy to review.
I would think this patch already self-contained in that all are
implementation of cgroup reclamation building blocks utilized later. But
I'll try to follow your suggestions below to split further (would prefer
not to merge in general unless there is strong reasons).
For #2, sorry for not being explicit -- I meant it seems it's more
reasonable to
split in this way:
Patch 1)
a). change to sgx_reclaim_pages();
I'll still prefer this to be a separate patch. It is self-contained IMHO.
We were splitting the original patch because it was too big. I don't want
to merge back unless there is a strong reason.
b). introduce sgx_epc_cgroup_reclaim_pages();
Ok.
c). introduce sgx_epc_cgroup_reclaim_work_func() (use a better name),
which just takes an EPC cgroup as input w/o involving any
work/workqueue.
This is for the workqueue use only. So I think it'd be better be with
patch #2 below?
These functions are all related to how to implement reclaiming pages
from a
given EPC cgroup, and they are logically related in terms of
implementation thus
it's easier to be reviewed together.
This is pretty much the current patch + sgx_reclaim_pages() - workqueue.
Then you just need to justify the design/implementation in
changelog/comments.
How about just do b) in patch #1, and state the new function is the
building block and will be used for both direct and indirect reclamation?
Patch 2)
- Introduce work/workqueue, and implement the logic to queue the work.
Now we all know there's a function to reclaim pages for a given EPC
cgroup, then
we can focus on when that is called, either directly or indirectly.
The try_charge() function will do both actually.
For indirect, it queue the work to the wq. For direct it just calls
sgx_epc_cgroup_reclaim_pages().
That part is already in separate (I think self-contained) patch [v9,
10/15].
So for this patch, I'll add sgx_epc_cgroup_reclaim_work_func() and
introduce work/workqueue so later work can be queued?
#1 would force me go back and merge the patches again.
I don't think so. I am not asking to put all things together, but only
asking
to split in better way (that I see).
Okay.
You mentioned some function is "Scheduled by sgx_epc_cgroup_try_charge()
to
reclaim pages", but I am not seeing any code doing that in this patch.
This
needs fixing, either by moving relevant code here, or removing these
not-done-
yet comments.
Yes. The comments will be fixed.
For instance (I am just giving an example), if after review we found the
queue_work() shouldn't be done in try_charge(), you will need to go back
to this
patch and remove these comments.
That's not the best way. Each patch needs to be self-contained.
Sorry I feel kind of lost on this whole thing by now. It seems so random
to me. Is there hard rules on this?
See above. I am only offering my opinion on how to split patches in
better way.
To be honest, the part I'm feeling most confusing is this
self-contained-ness. It seems depend on how you look at things.
I was hoping these statements would help reviewers on the flow of the
patches.
At the end of [v9 04/15]:
For now, the EPC cgroup simply blocks additional EPC allocation in
sgx_alloc_epc_page() when the limit is reached. Reclaimable pages are
still tracked in the global active list, only reclaimed by the global
reclaimer when the total free page count is lower than a threshold.
Later patches will reorganize the tracking and reclamation code in the
global reclaimer and implement per-cgroup tracking and reclaiming.
At the end of [v9 06/15]:
Next patches will first get the cgroup reclamation flow ready while
keeping pages tracked in the global LRU and reclaimed by ksgxd before we
make the switch in the end for sgx_lru_list() to return per-cgroup
LRU.
At the end of [v9 08/15]:
Both synchronous and asynchronous flows invoke the same top level
reclaim
function, and will be triggered later by sgx_epc_cgroup_try_charge()
when usage of the cgroup is at or near its limit.
At the end of [v9 10/15]:
Note at this point, all reclaimable EPC pages are still tracked in the
global LRU and per-cgroup LRUs are empty. So no per-cgroup reclamation
is activated yet.
They are useful in the changelog in each patch I suppose, but to me we
are
discussing different things.
I found one pain in the review is I have to jump back and forth many
times among
multiple patches to see whether one patch is reasonable. That's why I
am asking
whether there's better way to split patches so that each patch can be
self-
contained logically in someway and easier to review.
I appreciate very much your time and effort on providing detailed review.
You have been very helpful.
If you think it makes sense, I'll split this patch into 2 with stated
modifications above.
Thanks
Haitao