On Mon, Dec 24, 2018 at 09:10:56AM +0100, Michal Hocko wrote:
> On Sat 22-12-18 09:04:21, Nicholas Mc Guire wrote:
> > On Fri, Dec 21, 2018 at 01:58:39PM -0800, David Rientjes wrote:
> > > On Thu, 20 Dec 2018, Nicholas Mc Guire wrote:
> > >
> > > > diff --git a/mm/vmalloc.c b/mm/vmalloc.c
> > > > index 871e41c..1c118d7 100644
> > > > --- a/mm/vmalloc.c
> > > > +++ b/mm/vmalloc.c
> > > > @@ -1258,7 +1258,7 @@ void __init vmalloc_init(void)
> > > >
> > > >  	/* Import existing vmlist entries. */
> > > >  	for (tmp = vmlist; tmp; tmp = tmp->next) {
> > > > -		va = kzalloc(sizeof(struct vmap_area), GFP_NOWAIT);
> > > > +		va = kzalloc(sizeof(*va), GFP_NOWAIT | __GFP_NOFAIL);
> > > >  		va->flags = VM_VM_AREA;
> > > >  		va->va_start = (unsigned long)tmp->addr;
> > > >  		va->va_end = va->va_start + tmp->size;
> > >
> > > Hi Nicholas,
> > >
> > > You're right that this looks wrong because there's no guarantee that va is
> > > actually non-NULL. __GFP_NOFAIL won't help in init, unfortunately, since
> > > we're not giving the page allocator a chance to reclaim so this would
> > > likely just end up looping forever instead of crashing with a NULL pointer
> > > dereference, which would actually be the better result.
> > >
> > tried tracing the __GFP_NOFAIL path and had concluded that it would
> > end in out_of_memory() -> panic("System is deadlocked on memory\n");
> > which also should point cleanly to the cause - but I´m actually not
> > that sure if that trace was correct in all cases.
>
> No, we do not trigger the memory reclaim path nor the oom killer when
> using GFP_NOWAIT. In fact the current implementation even ignores
> __GFP_NOFAIL AFAICS (so I was wrong about the endless loop but I suspect
> that we used to loop fpr __GFP_NOFAIL at some point in the past). The
> patch simply doesn't have any effect. But the primary objection is that
> the behavior might change in future and you certainly do not want to get
> stuck in the boot process without knowing what is going on. Crashing
> will tell you that quite obviously. Although I have hard time imagine
> how that could happen in a reasonably configured system.

I think most of the defensive structures cover rare to almost
impossible cases - but those are precisely the hard ones to understand
if they do happen.

>
> > > You could do
> > >
> > > BUG_ON(!va);
> > >
> > > to make it obvious why we crashed, however. It makes it obvious that the
> > > crash is intentional rather than some error in the kernel code.
> >
> > makes sense - that atleast makes it imediately clear from the code
> > that there is no way out from here.
>
> How does it differ from blowing up right there when dereferencing flags?
> It would be clear from the oops.

The question is how soon it blows up. If it is immediate, then there
is probably no real difference. If there is some delay, say due to the
region affected by the NULL pointer not being immediately in use, it
may be very hard to differentiate between an allocation failure and
memory corruption, so having a directly associated trace should be
significantly simpler to understand. And you might actually not want a
system to try booting if there are problems at this level.

thx!
hofrat
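
P.S. To illustrate the explicit-check variant discussed above, here is a
minimal userspace sketch, not the actual kernel code. It assumes a
stand-in BUG_ON() macro built on abort(), simplified mock versions of
struct vm_struct and struct vmap_area, calloc() in place of kzalloc(),
an illustrative value for VM_VM_AREA, and a hypothetical helper named
import_entry(). The only point is that the failure is reported at the
allocation site rather than at some later dereference.

```c
#include <stdio.h>
#include <stdlib.h>

/* Userspace stand-in for the kernel's BUG_ON(): say where the check
 * fired, then stop. The real macro triggers a kernel oops instead. */
#define BUG_ON(cond)                                              \
	do {                                                      \
		if (cond) {                                       \
			fprintf(stderr, "BUG at %s:%d: %s\n",     \
				__FILE__, __LINE__, #cond);       \
			abort();                                  \
		}                                                 \
	} while (0)

/* Simplified mock types; the kernel structures carry more fields. */
struct vm_struct {
	struct vm_struct *next;
	void *addr;
	unsigned long size;
};

struct vmap_area {
	unsigned long va_start;
	unsigned long va_end;
	unsigned long flags;
};

/* Illustrative value only; treat it as an assumption. */
#define VM_VM_AREA 0x04

/* Hypothetical helper: import one vmlist entry and fail loudly at the
 * allocation site if no memory is available. */
static struct vmap_area *import_entry(const struct vm_struct *tmp)
{
	/* calloc() stands in for kzalloc(sizeof(*va), GFP_NOWAIT). */
	struct vmap_area *va = calloc(1, sizeof(*va));

	/* The report names this line, not a later NULL dereference. */
	BUG_ON(!va);

	va->flags = VM_VM_AREA;
	va->va_start = (unsigned long)tmp->addr;
	va->va_end = va->va_start + tmp->size;
	return va;
}
```

With a failing allocation the message names the BUG_ON() line directly,
which is the "directly associated trace" argued for above; whether that
is worth the extra line in vmalloc_init() is the open question in this
thread.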