Re: [Bug #13319] Page allocation failures with b43 and p54usb

David Rientjes <rientjes@xxxxxxxxxx> · Tue, 9 Jun 2009 00:54:44 -0700 (PDT)

On Tue, 9 Jun 2009, Pekka Enberg wrote:

> Hi Mel,
> 
> On Mon, 2009-06-08 at 15:12 +0100, Mel Gorman wrote:
> > > diff --git a/mm/slub.c b/mm/slub.c
> > > index 65ffda5..b5acf18 100644
> > > --- a/mm/slub.c
> > > +++ b/mm/slub.c
> > > @@ -1565,6 +1565,8 @@ new_slab:
> > >  		c->page = new;
> > >  		goto load_freelist;
> > >  	}
> > > +	printk(KERN_WARNING "SLUB: unable to satisfy allocation for cache %s (size=%d, node=%d, gfp=%x)\n",
> > > +		s->name, s->size, node, gfpflags);
> > 
> > size could be almost anything here for a casual reader. You are
> > outputting the size of the object plus its metadata so the name should
> > reflect that. I think it would be better to output objsize= and the
> > object size without the metadata overhead. What do you think?
> > 
> > In addition, include how many objects there are per slab and what order
> > is being passed to the page allocator when allocating new slabs.
> > Would that be enough to determine if fallback to smaller orders occurred?
> 
> So how about something like this then?
> 

Larry reported this stack trace:

kernel: git: page allocation failure. order:1, mode:0x4020
kernel: Pid: 3707, comm: git Not tainted 2.6.30-rc1-wl #115
kernel: Call Trace:
kernel:  [<ffffffff80292f84>] __alloc_pages_internal+0x43d/0x45d
kernel:  [<ffffffff802b2383>] alloc_pages_current+0xbe/0xc6
kernel:  [<ffffffff802b66a4>] new_slab+0xcf/0x28b

That's in the order fallback for new slab allocations; so this cache must 
have oo_order(s->min) of 1.
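
The mode in that failure message can be used to sanity-check this.  Below
is a minimal user-space sketch, not kernel code.  The bit values are how I
recall them from include/linux/gfp.h of this era and should be treated as
assumptions to verify against the tree; decode_gfp_mode() and
looks_like_min_order_attempt() are hypothetical helper names.  It relies on
my reading of allocate_slab(), where the first, higher-order attempt adds
__GFP_NOWARN | __GFP_NORETRY, so a failure reported without those bits
would have to come from the retry at s->min.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Assumed 2.6.30-era bit values; check include/linux/gfp.h in your tree. */
#define SKETCH_GFP_HIGH    0x20u    /* __GFP_HIGH, i.e. GFP_ATOMIC */
#define SKETCH_GFP_NOWARN  0x200u   /* __GFP_NOWARN */
#define SKETCH_GFP_NORETRY 0x1000u  /* __GFP_NORETRY */
#define SKETCH_GFP_COMP    0x4000u  /* __GFP_COMP */

struct gfp_name {
	unsigned int bit;
	const char *name;
};

static const struct gfp_name gfp_names[] = {
	{ SKETCH_GFP_HIGH,    "__GFP_HIGH" },
	{ SKETCH_GFP_NOWARN,  "__GFP_NOWARN" },
	{ SKETCH_GFP_NORETRY, "__GFP_NORETRY" },
	{ SKETCH_GFP_COMP,    "__GFP_COMP" },
};

/*
 * Write the names of the known bits set in 'mode' into 'buf', joined by
 * '|'.  Returns the bits of 'mode' that are not in the table above.
 */
static unsigned int decode_gfp_mode(unsigned int mode, char *buf, size_t len)
{
	unsigned int unknown = mode;
	size_t used = 0;
	size_t i;

	if (len > 0)
		buf[0] = '\0';

	for (i = 0; i < sizeof(gfp_names) / sizeof(gfp_names[0]); i++) {
		int n;

		if (!(mode & gfp_names[i].bit))
			continue;
		unknown &= ~gfp_names[i].bit;

		if (used >= len)
			continue;	/* no room left, keep clearing bits */
		n = snprintf(buf + used, len - used, "%s%s",
			     used ? "|" : "", gfp_names[i].name);
		if (n > 0)
			used += (size_t)n;
	}
	return unknown;
}

/*
 * True if neither __GFP_NOWARN nor __GFP_NORETRY is set, which is what the
 * retry at the minimum order is assumed to look like.
 */
static bool looks_like_min_order_attempt(unsigned int mode)
{
	return (mode & (SKETCH_GFP_NOWARN | SKETCH_GFP_NORETRY)) == 0;
}
```

Feeding it the mode from Larry's trace (0x4020) decodes to an atomic,
compound-page request with neither of the two "first attempt" bits set,
which is consistent with the failure coming from the minimum-order retry.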

To diagnose whether its object size dictates a >0 slab order, you could
enable CONFIG_SLUB_STATS (it's disabled in his .config) and check which
/sys/kernel/slab/<cache>/order_fallback counter increased.  Once you have
identified the cache, you can get its object size, slab order, and slab
size via /sys/kernel/slab/<cache>/{object_size,order,slab_size}.  I think
this is what Christoph was getting at.
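
Reading those files by hand is fine for one cache, but comparing every
cache before and after reproducing the failure is easier with a few
helpers.  This is only a sketch under assumptions: the attribute names
(order_fallback, object_size, order, slab_size) and the stat file layout
of a total followed by per-cpu "C<n>=<count>" fields are as I remember
them from mm/slub.c, and the helper names are hypothetical.

```c
#include <errno.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

/* Build "<root>/<cache>/<attr>".  Returns 0 on success, -1 if truncated. */
static int slab_attr_path(char *out, size_t len, const char *root,
			  const char *cache, const char *attr)
{
	int n = snprintf(out, len, "%s/%s/%s", root, cache, attr);

	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/*
 * Parse the leading non-negative integer of an attribute's contents,
 * e.g. "3 C0=1 C1=2" for a stat file or "1" for the order file.
 * Returns -1 if the text does not start with a valid count.
 */
static long parse_leading_count(const char *text)
{
	char *end;
	long v;

	errno = 0;
	v = strtol(text, &end, 10);
	if (end == text || errno != 0 || v < 0)
		return -1;
	return v;
}

/* Read one attribute of one cache; returns its leading count or -1. */
static long read_slab_attr(const char *root, const char *cache,
			   const char *attr)
{
	char path[512];
	char line[256];
	FILE *f;
	long v = -1;

	if (slab_attr_path(path, sizeof(path), root, cache, attr) != 0)
		return -1;

	f = fopen(path, "r");
	if (!f)
		return -1;	/* missing cache, or stats not compiled in */
	if (fgets(line, sizeof(line), f))
		v = parse_leading_count(line);
	fclose(f);
	return v;
}
```

A caller would pass "/sys/kernel/slab" as the root, record order_fallback
for each directory entry, reproduce the failure, and then read the three
size-related attributes of whichever cache's counter went up.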

You could even boot with `slub_nomerge' to determine whether cache merging
was the issue, i.e. whether the cache under consideration was unnecessarily
merged with one that requires a higher minimum order.
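
When comparing a merged and an unmerged boot, it helps to confirm that the
option really made it onto the kernel command line.  A minimal sketch,
assuming the caller has already read /proc/cmdline into a string;
cmdline_has_param() is a hypothetical helper and only matches a bare,
space-separated token.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * True if 'param' appears in 'cmdline' as a whole token, i.e. delimited
 * by spaces, the start of the string, or the end of the line.
 */
static bool cmdline_has_param(const char *cmdline, const char *param)
{
	size_t plen = strlen(param);
	const char *p = cmdline;

	if (plen == 0)
		return false;

	while ((p = strstr(p, param)) != NULL) {
		bool starts = (p == cmdline) || (p[-1] == ' ');
		char next = p[plen];
		bool ends = (next == '\0' || next == ' ' || next == '\n');

		if (starts && ends)
			return true;
		p++;	/* keep searching after this partial match */
	}
	return false;
}
```

Matching whole tokens avoids a false positive on a longer parameter that
merely contains the same substring.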

I don't quite understand why it's necessary to print the partial lists for
each node.  They should be exhausted if we're allocating a new slab,
assuming the node doesn't matter (and it can't in Larry's case, since he
only has one).