On Thu, 14 Sep 2023 19:54:57 -0400
Gregory Price <gourry.memverge@xxxxxxxxx> wrote:

> The partial-interleave mempolicy implements interleave on an

I'm not sure 'partial' really conveys what is going on here.
Weighted, or uneven-interleave maybe?

> allocation interval. The default node is the local node, for
> which N pages will be allocated before an interleave pass occurs.
>
> For example:
>   nodes=0,1,2
>   interval=3
>   cpunode=0
>
> Over 10 consecutive allocations, the following nodes will be selected:
> [0,0,0,1,2,0,0,0,1,2]
>
> In this example, there is a 60%/20%/20% distribution of memory.
>
> Using this mechanism, it becomes possible to define an approximate
> distribution percentage of memory across a set of nodes:
>
> local_node% : interval/((nr_nodes-1)+interval-1)
> other_node% : (1-local_node%)/(nr_nodes-1)

I'd like to see more discussion here of why you would do this...

A few trivial bits inline,

Jonathan

...

> +static unsigned long alloc_pages_bulk_array_partial_interleave(gfp_t gfp,
> +		struct mempolicy *pol, unsigned long nr_pages,
> +		struct page **page_array)
> +{
> +	nodemask_t nodemask = pol->nodes;
> +	unsigned long nr_pages_main;
> +	unsigned long nr_pages_other;
> +	unsigned long total_cycle;
> +	unsigned long delta;
> +	unsigned long interval;
> +	int allocated = 0;
> +	int start_nid;
> +	int nnodes;
> +	int prev, next;
> +	int i;
> +
> +	/* This stabilizes nodes on the stack incase pol->nodes changes */
> +	barrier();
> +
> +	nnodes = nodes_weight(nodemask);
> +	start_nid = numa_node_id();
> +
> +	if (!node_isset(start_nid, nodemask))
> +		start_nid = first_node(nodemask);
> +
> +	if (nnodes == 1) {
> +		allocated = __alloc_pages_bulk(gfp, start_nid,
> +					       NULL, nr_pages_main,
> +					       NULL, page_array);
> +		return allocated;

	return __alloc_pages_bulk(...)

> +	}
> +	/* We don't want to double-count the main node in calculations */
> +	nnodes--;
> +
> +	interval = pol->part_int.interval;
> +	total_cycle = (interval + nnodes);

excess brackets.  Same in various other places.

> +	/* Number of pages on main node: (cycles*interval + up to interval) */
> +	nr_pages_main = ((nr_pages / total_cycle) * interval);
> +	nr_pages_main += (nr_pages % total_cycle % (interval + 1));
> +	/* Number of pages on others: (remaining/nodes) + 1 page if delta */
> +	nr_pages_other = (nr_pages - nr_pages_main) / nnodes;
> +	nr_pages_other /= nnodes;
> +	/* Delta is number of pages beyond interval up to full cycle */
> +	delta = nr_pages - (nr_pages_main + (nr_pages_other * nnodes));
> +
> +	/* start by allocating for the main node, then interleave rest */
> +	prev = start_nid;
> +	allocated = __alloc_pages_bulk(gfp, start_nid, NULL, nr_pages_main,
> +				       NULL, page_array);
> +	for (i = 0; i < nnodes; i++) {
> +		int pages = nr_pages_other + (delta-- ? 1 : 0);
> +
> +		next = next_node_in(prev, nodemask);
> +		if (next < MAX_NUMNODES)
> +			prev = next;
> +		allocated += __alloc_pages_bulk(gfp, next, NULL, pages,
> +						NULL, page_array);
> +	}
> +
> +	return allocated;
> +}
> +
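
As an aside, it took me a couple of passes to convince myself of the split
arithmetic above.  For reference, here is how I read the intended per-node
split for a bulk allocation -- a quick userspace sketch of my own (not code
from the patch, names are made up), walking the cycle of 'interval' pages on
the local node followed by one page on each other node:

	#include <stdio.h>

	/*
	 * Illustrative only: report how many pages each node receives if
	 * nr_pages are handed out following the cycle described in the
	 * cover letter.
	 */
	static void split_pages(unsigned long nr_pages, int nnodes,
				unsigned long interval)
	{
		unsigned long cycle = interval + (nnodes - 1);
		unsigned long full = nr_pages / cycle;
		unsigned long rem = nr_pages % cycle;
		unsigned long local, spill;
		int i;

		/* Full cycles give the local node 'interval' pages each... */
		local = full * interval;
		/* ...plus up to 'interval' pages of the trailing partial cycle */
		local += rem < interval ? rem : interval;
		/* Anything left of the partial cycle spills onto other nodes */
		spill = rem > interval ? rem - interval : 0;

		printf("local node: %lu pages\n", local);
		for (i = 0; i < nnodes - 1; i++)
			printf("other node %d: %lu pages\n", i,
			       full + ((unsigned long)i < spill ? 1 : 0));
	}

	int main(void)
	{
		split_pages(10, 3, 3);	/* cover letter example */
		return 0;
	}

Running it with nr_pages=10, nnodes=3, interval=3 prints 6/2/2, i.e. the
60%/20%/20% example from the cover letter.  If the bulk path is meant to
distribute pages differently from that per-page pattern, a comment saying
so would help.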