On 07/14/2017 10:00 AM, Michal Hocko wrote:
> From: Michal Hocko <mhocko@xxxxxxxx>
>
> build_zonelists gradually builds zonelists from the nearest to the most
> distant node. As we do not know how many populated zones we will have in
> each node we rely on the _zoneref to terminate initialized part of the
> zonelist by a NULL zone. While this is functionally correct it is quite
> suboptimal because we cannot allow updaters to race with zonelists
> users because they could see an empty zonelist and fail the allocation
> or hit the OOM killer in the worst case.
>
> We can do much better, though. We can store the node ordering into an
> already existing node_order array and then give this array to
> build_zonelists_in_node_order and do the whole initialization at once.
> zonelists consumers still might see halfway initialized state but that
> should be much more tolerable because the list will not be empty and
> they would either see some zone twice or skip over some zone(s) in the
> worst case which shouldn't lead to immediate failures.
>
> This patch alone doesn't introduce any functional change yet, though, it
> is merely a preparatory work for later changes.
>
> Signed-off-by: Michal Hocko <mhocko@xxxxxxxx>

I've collected the fold-ups from this thread and looked at the result as
a single patch. Seems OK, just two things:

- please rename variable "i" in build_zonelists() to e.g. "nr_nodes"
- the !CONFIG_NUMA variant of build_zonelists() won't build, because it
  doesn't declare the nr_zones variable
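
For reference, here is a rough userspace sketch of the two-phase scheme the
changelog describes. It is not the actual patch. The struct definitions are
simplified stand-ins, collect_node_order() and the distance[] convention
(negative means the node is absent) are made up for illustration, and the
signature of build_zonelists_in_node_order() is hypothetical. Only the names
node_order, nr_nodes, nr_zones and build_zonelists_in_node_order come from
this thread.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_NODES          4
#define MAX_ZONES_PER_NODE 3
/* Room for every zone of every node plus the NULL terminator. */
#define MAX_ZONEREFS       (MAX_NODES * MAX_ZONES_PER_NODE + 1)

/* Hypothetical, heavily simplified stand-ins for the kernel structures. */
struct zone    { int populated; };
struct zoneref { struct zone *zone; };
struct node    { struct zone zones[MAX_ZONES_PER_NODE]; };

/*
 * Phase 1: record the node ordering, nearest node first.
 * distance[n] is the distance from the local node to node n;
 * a negative value marks an absent node (an assumption of this sketch).
 * Returns the number of entries written to node_order[].
 */
static int collect_node_order(const int distance[MAX_NODES],
                              int node_order[MAX_NODES])
{
    int used[MAX_NODES] = {0};
    int nr_nodes = 0;

    for (;;) {
        int best = -1;

        for (int n = 0; n < MAX_NODES; n++) {
            if (used[n] || distance[n] < 0)
                continue;
            if (best < 0 || distance[n] < distance[best])
                best = n;
        }
        if (best < 0)
            break;          /* no unvisited node left */
        used[best] = 1;
        node_order[nr_nodes++] = best;
    }
    return nr_nodes;
}

/*
 * Phase 2: fill the whole zonelist in one pass and terminate it once,
 * instead of re-terminating after every node is appended.
 * Returns the number of zonerefs written, not counting the terminator.
 */
static int build_zonelists_in_node_order(struct node nodes[MAX_NODES],
                                         const int node_order[],
                                         int nr_nodes,
                                         struct zoneref zonerefs[MAX_ZONEREFS])
{
    int nr_zones = 0;

    assert(nr_nodes <= MAX_NODES);

    for (int i = 0; i < nr_nodes; i++) {
        struct node *node = &nodes[node_order[i]];

        /* Highest zone index first; skip unpopulated zones. */
        for (int z = MAX_ZONES_PER_NODE - 1; z >= 0; z--) {
            if (node->zones[z].populated)
                zonerefs[nr_zones++].zone = &node->zones[z];
        }
    }
    zonerefs[nr_zones].zone = NULL;   /* single terminator at the end */
    return nr_zones;
}
```

Because the ordering is known before anything is written, the list never
passes through the "only a terminator" state that the incremental approach
has to go through for each appended node.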