On Mon 06-12-21 13:43:27, David Hildenbrand wrote: [...] > > Now practically speaking !node_online should not apear node_online (note > > I am attentionally avoiding to say offline and online as that has a > > completely different semantic) shouldn't really happen for some > > architectures. x86 should allocate pgdat for each possible node. I do > > not know what was the architecture in this case but we already have > > another report for x86 that remains unexplained. > > So we'd allocate the pgdat although all we want is just a zonelist. The > obvious alternative is to implement the fallback where reasonable -- for > example, in the page allocator. It knows the fallback order: > build_zonelists(). That's pretty much all we need the preferred_nid for. > > So just making prepare_alloc_pages()/node_zonelist() deal with a missing > pgdat could make sense as well. Something like: > > > diff --git a/include/linux/gfp.h b/include/linux/gfp.h > index b976c4177299..2d2649e78766 100644 > --- a/include/linux/gfp.h > +++ b/include/linux/gfp.h > @@ -508,9 +508,14 @@ static inline int gfp_zonelist(gfp_t flags) > * > * For the case of non-NUMA systems the NODE_DATA() gets optimized to > * &contig_page_data at compile-time. > + * > + * If the node does not have a pgdat yet, returns the zonelist of the > + * first online node. > */ > static inline struct zonelist *node_zonelist(int nid, gfp_t flags) > { > + if (unlikely(!NODE_DATA(nid))) > + nid = first_online_node; > return NODE_DATA(nid)->node_zonelists + gfp_zonelist(flags); > } This is certainly possible. But it a) adds a branch to the hotpath and b) it doesn't solve any other potential dereference of garbage node. > But of course, there might be value in a proper node-aware fallback list > as we have in build_zonelists() -- but it remains questionable if the > difference for these corner cases would be relevant in practice. Only the platform knows the proper node topology and that includes memory less nodes. So they should be setting up a node properly and we shouldn't be dealing with this at the allocator nor any other code. > Further, if we could have thousands of nodes, we'd have to update each > and every one when building zone lists ... Why would that be a practical problem? > Removing the hotadd_new_pgdat() stuff does sound appealing, though. Yes our hotplug code could be simplified as well. All we really need is an arch code to initialize all the possible nodes. -- Michal Hocko SUSE Labs