On Fri, 22 Nov 2019 10:53:22 -0500 Qian Cai <cai@xxxxxx> wrote:

> On Fri, 2019-11-22 at 23:28 +0800, Pengfei Li wrote:
> > On Fri, 22 Nov 2019 15:25:00 +0800
> > "lixinhai.lxh@xxxxxxxxx" <lixinhai.lxh@xxxxxxxxx> wrote:
> >
> > > On 2019-11-21 at 23:17 Pengfei Li wrote:
> > > > Motivation
> > > > ----------
> > > > Currently if we want to iterate through all the nodes we have to
> > > > traverse all the zones from the zonelist.
> > > >
> > > > So in order to reduce the number of loops required to traverse
> > > > node, this series of patches modified the zonelist to nodelist.
> > > >
> > > > Two new macros have been introduced:
> > > > 1) for_each_node_nlist
> > > > 2) for_each_node_nlist_nodemask
> > > >
> > > >
> > > > Benefit
> > > > -------
> > > > 1. For a NUMA system with N nodes, each node has M zones, the
> > > >    number of loops is reduced from N*M times to N times when
> > > >    traversing node.
> > > >
> > >
> > > It looks to me that we don't really have system which has N nodes
> > > and each node with M zones in its address range.
> > > We may have systems which has several nodes, but only the first
> > > node has all zone types, other nodes only have NORMAL zone.
> > > (Evenly distribute the !NORMAL zones on all nodes is not
> > > reasonable, as those zones have limited size)
> > > So iterate over zones to reach nodes should at N level, not M*N
> > > level.
> > >
> >
> > Thanks for your comments.
> >
> > In the case you said, the number of loops required to traverse all
> > nodes is similar to traversing all zones.
> >
> > I have two main reasons to explain that this series of patches is
> > beneficial.
> >
> > 1. When node has more than one zone, it will take fewer cycles to
> > traverse all nodes. (for example, ZONE_MOVABLE?)
>
> ZONE_MOVABLE is broken for ages (non-movable allocations are there
> all the time last time I tried) which indicates there is very few
> people care about it, so it is rather weak to use that as a
> justification for the churns it might cause.
>

Thanks for your comments.

Yes, if a node has only ZONE_NORMAL, then its part of the zonelist is
effectively a nodelist. This series of patches really only benefits
nodes with more than one zone.

> >
> > 2. Using zonelist to traverse all nodes is inefficient, pgdat must
> > be obtained indirectly via zone->zone_pgdat, and additional
> > judgment must be made.
> >
> > E.g
> > 1) Using zonelist to traverse all nodes
> >
> > last_pgdat = NULL;
> > for_each_zone_zonelist(zone, xxx) {
> >     pgdat = zone->zone_pgdat;
> >     if (pgdat == last_pgdat)
> >         continue;
> >
> >     last_pgdat = pgdat;
> >     do_something(pgdat);
> > }
> >
> > 2) Using nodelist to traverse all nodes
> >
> > for_each_node_nodelist(node, xxx) {
> >     do_something(NODE_INFO(node));
> > }
> >

-- 
Pengfei
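
P.S. To make the N*M versus N comparison concrete, here is a small
userspace model of the two loops quoted above. It is only a sketch, not
kernel code and not taken from the patch series: struct toy_pgdat,
struct toy_zone, walk_nodes_via_zonelist(), walk_nodes_via_nodelist()
and toy_count_node() are hypothetical names made up for illustration,
and the model assumes that zones belonging to one node sit next to each
other in the list, so the last_pgdat check is enough to skip duplicates.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for kernel structures; every name here is hypothetical. */
struct toy_pgdat {
    int node_id;
};

struct toy_zone {
    struct toy_pgdat *zone_pgdat;   /* back-pointer, like zone->zone_pgdat */
};

/* Example callback: counts how many distinct nodes were handed to it. */
static size_t toy_nodes_seen;

static void toy_count_node(struct toy_pgdat *pgdat)
{
    (void)pgdat;
    toy_nodes_seen++;
}

/*
 * Reach every node by walking a zonelist-like array.
 * Each zone costs one loop iteration; consecutive zones of the same
 * node are skipped with the last_pgdat comparison.
 * Returns the number of loop iterations performed.
 */
static size_t walk_nodes_via_zonelist(struct toy_zone *zones, size_t nr_zones,
                                      void (*fn)(struct toy_pgdat *))
{
    struct toy_pgdat *last_pgdat = NULL;
    size_t iterations = 0;

    for (size_t i = 0; i < nr_zones; i++) {
        struct toy_pgdat *pgdat = zones[i].zone_pgdat;

        assert(pgdat != NULL);  /* every zone must belong to a node */
        iterations++;
        if (pgdat == last_pgdat)
            continue;

        last_pgdat = pgdat;
        fn(pgdat);
    }
    return iterations;
}

/*
 * Reach every node by walking a nodelist-like array directly.
 * One iteration per node, no indirection and no duplicate check.
 * Returns the number of loop iterations performed.
 */
static size_t walk_nodes_via_nodelist(struct toy_pgdat *nodes, size_t nr_nodes,
                                      void (*fn)(struct toy_pgdat *))
{
    size_t iterations = 0;

    for (size_t i = 0; i < nr_nodes; i++) {
        iterations++;
        fn(&nodes[i]);
    }
    return iterations;
}
```

With 2 nodes of 3 zones each, the zonelist walk does 6 iterations and
the nodelist walk does 2, while both call the callback twice. With one
zone per node the two walks do the same number of iterations, which is
the case raised earlier in this thread.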