On 18/06/2024 12:53 am, Huang, Haitao wrote:
> From: Kristen Carlson Accardi <kristen@xxxxxxxxxxxxxxx>
>
> Previous patches have implemented all infrastructure needed for
> per-cgroup EPC page tracking and reclaiming. But all reclaimable EPC
> pages are still tracked in the global LRU as sgx_epc_page_lru() always
> returns reference to the global LRU.
>
> Change sgx_epc_page_lru() to return the LRU of the cgroup in which the
> given EPC page is allocated.
>
> This makes all EPC pages tracked in per-cgroup LRUs and the global
> reclaimer (ksgxd) will not be able to reclaim any pages from the global
> LRU. However, in cases of over-committing, i.e., the sum of cgroup
> limits greater than the total capacity, cgroups may never reclaim but
> the total usage can still be near the capacity. Therefore a global
> reclamation is still needed in those cases and it should be performed
> from the root cgroup.
>
> Modify sgx_reclaim_pages_global(), to reclaim from the root EPC cgroup
> when cgroup is enabled. Similar to sgx_cgroup_reclaim_pages(), return
> the next cgroup so callers can use it as the new starting node for next
> round of reclamation if needed.
>
> Also update sgx_can_reclaim_global(), to check emptiness of LRUs of all
> cgroups when EPC cgroup is enabled, otherwise only check the global LRU.
>
> Finally, change sgx_reclaim_direct(), to check and ensure there are free
> pages at cgroup level so forward progress can be made by the caller.

Reading above, it's not clear how the _new_ global reclaim works with
multiple LRUs. E.g., the current global reclaim essentially treats all
EPC pages equally when scanning those pages. From the above, I don't see
how this is achieved in the new global reclaim.

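To make the concern concrete, below is a rough userspace model, not kernel
code. Every name in it (toy_lru, toy_lru_pop, toy_reclaim_round_robin) is
made up for illustration, and the "take a batch from each cgroup in turn,
then return where to resume" walk is only an assumption about what the new
path does, based on the changelog text.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_PAGES 8

/* Hypothetical stand-in for one LRU: non-negative page "ages", oldest first. */
struct toy_lru {
	int age[MAX_PAGES];
	size_t head;	/* index of the oldest page not yet reclaimed */
	size_t count;	/* number of pages originally on the list */
};

/* Pop the oldest page from one LRU; returns -1 if the list is empty. */
static int toy_lru_pop(struct toy_lru *lru)
{
	if (lru->head >= lru->count)
		return -1;
	return lru->age[lru->head++];
}

/*
 * Assumed model of a multi-LRU reclaim pass: starting at group 'start',
 * take up to 'batch' pages from each group in turn, until 'want' pages
 * have been collected or every group has been visited once. Returns the
 * index of the group at which the next pass would start.
 */
static size_t toy_reclaim_round_robin(struct toy_lru *groups, size_t n_groups,
				      size_t start, size_t batch, size_t want,
				      int *victims, size_t *n_victims)
{
	size_t g, visited;

	assert(n_groups > 0);
	*n_victims = 0;
	g = start % n_groups;

	for (visited = 0; visited < n_groups && *n_victims < want; visited++) {
		size_t taken;

		for (taken = 0; taken < batch && *n_victims < want; taken++) {
			int age = toy_lru_pop(&groups[g]);

			if (age < 0)
				break;	/* this group's LRU is empty */
			victims[(*n_victims)++] = age;
		}
		g = (g + 1) % n_groups;
	}
	return g;
}
```

In this model, take group A with ages {1, 2, 3, 4} and group B with ages
{10, 11}, and ask for 4 pages with a batch of 2 starting at A. A single
shared LRU would hand back the four oldest pages (1, 2, 3, 4), whereas the
walk above hands back 1, 2, 10, 11. That is the kind of behaviour
difference the changelog needs to spell out and justify.
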
The changelog should:

1) describe how the existing global reclaim works, and then describe how
   to achieve the same behaviour in the new global reclaim which works
   with multiple LRUs;

2) if there's any behaviour difference between the "existing" and the
   "new" global reclaim, point out the difference and explain why such
   difference is OK.

>
> With these changes, the global reclamation and per-cgroup reclamation
> both work properly with all pages tracked in per-cgroup LRUs.
>

[...]

>
> -static void sgx_reclaim_pages_global(struct mm_struct *charge_mm)
> +static struct misc_cg *sgx_reclaim_pages_global(struct misc_cg *next_cg,
> +						struct mm_struct *charge_mm)
>  {
> +	if (IS_ENABLED(CONFIG_CGROUP_MISC))
> +		return sgx_cgroup_reclaim_pages(misc_cg_root(), next_cg, charge_mm);
> +
>  	sgx_reclaim_pages(&sgx_global_lru, charge_mm);
> +	return NULL;
>  }
>
>  /*
> @@ -414,12 +443,35 @@ static void sgx_reclaim_pages_global(struct mm_struct *charge_mm)
>   */
>  void sgx_reclaim_direct(void)
>  {
> +	struct sgx_cgroup *sgx_cg = sgx_get_current_cg();
> +	struct misc_cg *cg = misc_from_sgx(sgx_cg);