On Mon, 2023-11-27 at 00:27 +0800, Haitao Huang wrote: > On Mon, 20 Nov 2023 11:45:46 +0800, Huang, Kai <kai.huang@xxxxxxxxx> wrote: > > > On Mon, 2023-10-30 at 11:20 -0700, Haitao Huang wrote: > > > From: Sean Christopherson <sean.j.christopherson@xxxxxxxxx> > > > > > > To prepare for per-cgroup reclamation, separate the top-level reclaim > > > function, sgx_reclaim_epc_pages(), into two separate functions: > > > > > > - sgx_isolate_epc_pages() scans and isolates reclaimable pages from a > > > given LRU list. > > > - sgx_do_epc_reclamation() performs the real reclamation for the > > > already isolated pages. > > > > > > Create a new function, sgx_reclaim_epc_pages_global(), calling those two > > > in succession, to replace the original sgx_reclaim_epc_pages(). The > > > above two functions will serve as building blocks for the reclamation > > > flows in later EPC cgroup implementation. > > > > > > sgx_do_epc_reclamation() returns the number of reclaimed pages. The EPC > > > cgroup will use the result to track reclaiming progress. > > > > > > sgx_isolate_epc_pages() returns the additional number of pages to scan > > > for current epoch of reclamation. The EPC cgroup will use the result to > > > determine if more scanning to be done in LRUs in its children groups. > > > > This changelog says nothing about "why", but only mentions the > > "implementation". > > > > For instance, assuming we need to reclaim @npages_to_reclaim from the > > @epc_cgrp_to_reclaim and its descendants, why cannot we do: > > > > for_each_cgroup_and_descendants(&epc_cgrp_to_reclaim, &epc_cgrp) { > > if (npages_to_reclaim <= 0) > > return; > > > > npages_to_reclaim -= sgx_reclaim_pages_lru(&epc_cgrp->lru, > > npages_to_reclaim); > > } > > > > Is there any difference to have "isolate" + "reclaim"? > > > > This is to optimize "reclaim". See how etrack was done in sgx_encl_ewb. > Here we just follow the same design as ksgxd for each reclamation cycle. I don't see how did you "follow" ksgxd. If I am guessing correctly, you are afraid of there might be less than 16 pages in a given EPC cgroup, thus w/o splitting into "isolate" + "reclaim" you might feed the "reclaim" less than 16 pages, which might cause some performance degrade? But is this a common case? Should we even worry about this? I suppose for such new feature we should bring functionality first and then optimization if you have real performance data to show. > > > > > > > Signed-off-by: Sean Christopherson <sean.j.christopherson@xxxxxxxxx> > > > Co-developed-by: Kristen Carlson Accardi <kristen@xxxxxxxxxxxxxxx> > > > Signed-off-by: Kristen Carlson Accardi <kristen@xxxxxxxxxxxxxxx> > > > Co-developed-by: Haitao Huang <haitao.huang@xxxxxxxxxxxxxxx> > > > Signed-off-by: Haitao Huang <haitao.huang@xxxxxxxxxxxxxxx> > > > Cc: Sean Christopherson <seanjc@xxxxxxxxxx> > > > --- > > > > > > > [...] > > > > > +/** > > > + * sgx_do_epc_reclamation() - Perform reclamation for isolated EPC > > > pages. > > > + * @iso: List of isolated pages for reclamation > > > + * > > > + * Take a list of EPC pages and reclaim them to the enclave's private > > > shmem files. Do not > > > + * reclaim the pages that have been accessed since the last scan, and > > > move each of those pages > > > + * to the tail of its tracking LRU list. > > > + * > > > + * Limit the number of pages to be processed up to SGX_NR_TO_SCAN_MAX > > > per call in order to > > > + * degrade amount of IPI's and ETRACK's potentially required. > > > sgx_encl_ewb() does degrade a bit > > > + * among the HW threads with three stage EWB pipeline (EWB, ETRACK + > > > EWB and IPI + EWB) but not > > > + * sufficiently. Reclaiming one page at a time would also be > > > problematic as it would increase > > > + * the lock contention too much, which would halt forward progress. > > > > This is kinda optimization, correct? Is there any real performance data > > to > > justify this? > > The above sentences were there originally. This optimization was justified. I am talking about 16 -> 32. > > > If this optimization is useful, shouldn't we bring this > > optimization to the current sgx_reclaim_pages() instead, e.g., just > > increase > > SGX_NR_TO_SCAN (16) to SGX_NR_TO_SCAN_MAX (32)? > > > > SGX_NR_TO_SCAN_MAX might be designed earlier for other reasons I don't > know. Currently it is really the buffer size allocated in > sgx_reclaim_pages(). Both cgroup and ksgxd scan 16 pages a time.Maybe we > should just use SGX_NR_TO_SCAN. No _MAX needed. The point was to batch > reclamation to certain number to minimize impact of EWB pipeline. 16 was > the original design. > Please don't leave why you are trying to do this to the reviewers. If you don't know, then just drop this.