On Wed, 2024-02-21 at 00:44 -0600, Haitao Huang wrote: > [...] > > > > Here the @nr_to_scan is reduced by the number of pages that are > > isolated, but > > not actually reclaimed (which is reflected by @cnt). > > > > IIUC, looks you want to make this function do "each cycle" as what you > > mentioned > > in the v8 [1]: > > > > I tested with that approach and found we can only target number of > > pages > > attempted to reclaim not pages actually reclaimed due to the > > uncertainty > > of how long it takes to reclaim pages. Besides targeting number of > > scanned pages for each cycle is also what the ksgxd does. > > > > If we target actual number of pages, sometimes it just takes too long. > > I > > saw more timeouts with the default time limit when running parallel > > selftests. > > > > I am not sure what does "sometimes it just takes too long" mean, but > > what I am > > thinking is you are trying to do some perfect but yet complicated code > > here. > > I think what I observed was that the try_charge() would block too long > before getting chance of schedule() to yield, causing more timeouts than > necessary. > I'll do some re-test to be sure. Looks this is a valid information that can be used to justify whatever you are implementing in the EPC cgroup reclaiming function(s). > > > > > For instance, I don't think selftest reflect the real workload, and I > > believe > > adjusting the limit of a given EPC cgroup shouldn't be a frequent > > operation, > > thus it is acceptable to use some easy-maintain code but less perfect > > code. > > > > Here I still think having @nr_to_scan as a pointer is over-complicated. > > For > > example, we can still let sgx_reclaim_pages() to always scan > > SGX_NR_TO_SCAN > > pages, but give up when there's enough pages reclaimed or when the EPC > > cgroup > > and its descendants have been looped: > > > > unsigned int sgx_epc_cgroup_reclaim_pages(struct misc_cg *root) > > { > > unsigned int cnt = 0; > > ... > > > > css_for_each_descendant_pre(pos, css_root) { > > ... > > epc_cg = sgx_epc_cgroup_from_misc_cg(css_misc(pos)); > > cnt += sgx_reclaim_pages(&epc_cg->lru); > > > > if (cnt >= SGX_NR_TO_SCAN) > > break; > > } > > > > ... > > return cnt; > > } > > > > Yeah it may reclaim more than SGX_NR_TO_SCAN when the loop actually > > reaches any > > descendants, but that should be rare and we don't care that much, do we? > > > I assume you meant @cnt here to be number of pages actually reclaimed. Yes. > This could cause sgx_epc_cgroup_reclaim_pages() block too long as @cnt > may always be zero (all pages are too young) and you have to loop all > descendants. I am not sure whether this is a valid point. For example, your change in patch 10 "x86/sgx: Add EPC reclamation in cgroup try_charge()" already loops all descendants in below code: + if (sgx_epc_cgroup_lru_empty(epc_cg->cg)) + return -ENOMEM; Anyway, I can see all these can be justification to your design/implementation. My point is please put these justification in changelog/comments so that we can actually understand. Makes sense?