On Wed, Mar 10, 2021 at 07:44:39AM -0800, Dave Hansen wrote: > >>> + * node. > >>> + */ > >>> +static struct sgx_numa_node *sgx_numa_nodes; > >>> + > >>> +/* > >>> + * sgx_free_epc_page() uses this to find out the correct struct sgx_numa_node, > >>> + * to put the page in. > >>> + */ > >>> +static int sgx_section_to_numa_node_id[SGX_MAX_EPC_SECTIONS]; > >> > >> If this is per-section, why not put it in struct sgx_epc_section? > > > > Because struct sgx_epc_page does not contain a pointer to > > struct sgx_epc_section. > > Currently, we have epc_page->section. That's not changing. But, with > the free list moving from sgx_epc_section to sgx_numa_node, we need a > way to get from page->node, not just page->section. > > We can either add that to: > > struct sgx_epc_section { > ... > + struct sgx_numa_node *node; > } > > so we can do epc_page->section->node to find the epc_page's free list, > or we could just do: > > struct sgx_epc_page { > - unsigned int section; > + unsigned int node; > unsigned int flags; > struct sgx_encl_page *owner; > struct list_head list; > struct list_head numa_list; > }; > > and go from page->node directly. OK, I buy this, thanks. > >>> page = list_first_entry(&sgx_free_page_list, struct sgx_epc_page, list); > >>> + list_del_init(&page->numa_list); > >>> list_del_init(&page->list); > >>> sgx_nr_free_pages--; > >> > >> I would much rather prefer that this does what the real page allocator > >> does: kep the page on a single list. That list is maintained > >> per-NUMA-node. Allocations try local NUMA node structures, then fall > >> back to other structures (hopefully in a locality-aware fashion). > >> > >> I wrote you the loop that I want to see this implement in an earlier > >> review. This, basically: > >> > >> page = NULL; > >> nid = numa_node_id(); > >> while (true) { > >> page = __sgx_alloc_epc_page_from_node(nid); > >> if (page) > >> break; > >> > >> nid = // ... some search here, next_node_in()... > >> // check if we wrapped around: > >> if (nid == numa_node_id()) > >> break; > >> } > >> > >> There's no global list. You just walk around nodes trying to find one > >> with space. If you wrap around, you stop. > >> > >> Please implement this. If you think it's a bad idea, or can't, let's > >> talk about it in advance. Right now, it appears that my review comments > >> aren't being incorporated into newer versions. > > > > How I interpreted your earlier comments is that the fallback is unfair and > > this patch set version does fix that. > > > > I can buy the above allocation scheme, but I don't think this patch set > > version is a step backwards. The things done to struct sgx_epc_section > > are exactly what should be done to it. > > To me, it's a step backwards. It regresses in that it falls back to an > entirely non-NUMA aware allocation mechanism. The global list is > actually likely to be even worse than the per-section searches because > it has a global lock as opposed to the at per-section locks. It also > has the overhead of managing two lists instead of one. > > So, yes, it is *fair* in terms of NUMA node pressure. But being fair in > a NUMA-aware allocator by simply not doing NUMA at all is a regression. The code is structured now in a way that is trivial to remove the global list and move on to just node lists. I.e. nasty section lists have been wiped away. Refactoring global list out is a trivial step. That way this is a step forwards, even if having a global list would be step backwards:-) > > Implementation-wise you are asking me to squash 4/5 and 5/5 into a single > > patch, and remove global list. It's a tiny iteration from this patch > > version and I can do it. > > Sounds good. I'll dissolve your feedback and come with the new version, which I'll put out tomorrow. PS. If you don't here of me after you have given feedback to the next version, please ping privately. It looks like things are getting through again fast but better be sure than sorry... /Jarkko