Re: crash-5.0: zero-size memory-allocation

Dave Anderson <anderson@xxxxxxxxxx> · Wed, 13 Jan 2010 09:09:45 -0500 (EST)

----- "ville mattila" <ville.mattila@xxxxxxxxxxxxx> wrote:

> > From: Dave Anderson <anderson@xxxxxxxxxx>
> >
> ...
> > But your kernel shows cache_cache.buffer_size set to zero -- and the
> > ASSIGN_SIZE(kmem_cache_s) above dutifully downsized the data structure
> > size from 204 to zero. Later on, that size was used to allocate a
> > kmem_cache buffer, which failed when a GETBUF() was called with a zero-size.
> >
> > I guess a check could be made above for a zero cache_cache.buffer_size,
> > but why would that ever be?
> >
> > Try this:
> >
> > # crash --no_kmem_cache vmlinux vmcore
> >
> > which will allow you to get past the kmem_cache initialization.
> >
> > Then enter:
> >
> > crash> p cache_cache
> >
> > Does the "buffer_size" member really show zero?
> 
> Yes it seems so!
> initialize_task_state: using old defaults
> <readmem: 8067a300, KVADDR, "fill_task_struct", 868, (ROE), 86e3f78>
> addr: 8067a300 paddr: 67a300 cnt: 868
> STATE: TASK_RUNNING (PANIC)
> 
> crash> p cache_cache
> cache_cache = GETBUF(128 -> 0)
> <readmem: 8067f1c0, KVADDR, "gdb_readmem_callback", 204, (ROE), 8ac00d8>
> addr: 8067f1c0 paddr: 67f1c0 cnt: 204
> $3 = {
> array = {0x0, 0x8067f1c4, 0x8067f1c4, 0x0, 0x0, 0x0, 0x0, 0x0,
> 0xf7813e00, 0xf7849400, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0,
> 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0},
> batchcount = 0,
> limit = 0,
> shared = 0,
> buffer_size = 0,
> reciprocal_buffer_size = 0,
> flags = 0,
> num = 0,
> gfporder = 0,
> gfpflags = 60,
> colour = 120,
> colour_off = 8,
> slabp_cache = 0x100,
> slab_size = 16777216,
> dflags = 0,
> ctor = 0xf,
> name = 0x0,
> next = {
> next = 0x0,
> prev = 0x2
> },
> nodelists = {0x40}
> }
> FREEBUF(0)

That's some serious corruption!

> >
> > BTW, you can work around the problem by commenting out the call
> > to kmem_cache_downsize() in vm_init().
> 
> This workaround works ok.

But even with the call to kmem_cache_downsize() commented out,
kmem_cache_init() could not have done anything useful, because the
"cache_cache.next.next" pointer, which should point to the first entry
in the chain of kmem_cache slab cache headers, has been overwritten
with NULL. I'm surprised it managed to continue without running into
another roadblock. Did it display the "crash: unable to initialize
kmem slab cache subsystem" error message?
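To illustrate why a NULL in that pointer is fatal for the list walk, here
is a minimal sketch in plain C. It is not the crash source: the real code
reads each list entry out of the dumpfile rather than following host
pointers, and count_chain() and its max limit are hypothetical names made
up for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Same shape as the kernel's doubly-linked list node. */
struct list_head {
    struct list_head *next, *prev;
};

/*
 * Hypothetical helper: count the entries on a circular list starting
 * at head->next. Returns -1 if the chain is broken by a NULL pointer
 * or does not return to the head within max entries.
 */
static int count_chain(const struct list_head *head, int max)
{
    const struct list_head *p;
    int n = 0;

    assert(head != NULL && max > 0);

    for (p = head->next; p != head; p = p->next) {
        if (p == NULL || n >= max)
            return -1;      /* corrupted or runaway chain */
        n++;
    }
    return n;
}
```

With head.next set to NULL, as in the cache_cache output above, a walk
like this gives up before it sees a single slab cache header.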

> > (And if you're using makedumpfile with excluded pages, hope that
> > the problem I described above doesn't occur...)
> >
> We are not excluding files, so this is not a big issue. Also,
> --no_kmem_cache lets me open the dump and already do quite a lot
> of things.

As I mentioned before, I could add a check in kmem_cache_downsize()
for a zero buffer_size, but the odds of that happening are absurdly
small. I suppose I could check whether the value is less than the
offset of the kmem_cache.nodelists member.
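Roughly, the check might look like the sketch below. This is only an
illustration and not the actual crash code: checked_downsize() and its
parameters are hypothetical, and a real version would take the current
structure size and the nodelists offset from crash's own size and offset
tables.

```c
#include <assert.h>

/*
 * Hypothetical helper: decide what size to use for the kmem_cache
 * structure. cur_size is the size taken from the debuginfo,
 * buffer_size is the value read from cache_cache in the dump, and
 * nodelists_offset is the offset of the nodelists member.
 * The size is only reduced when buffer_size looks plausible.
 */
static long checked_downsize(long cur_size, long buffer_size,
                             long nodelists_offset)
{
    assert(cur_size > 0 && nodelists_offset >= 0);

    /* Zero, or too small to even reach nodelists: treat as corrupt. */
    if (buffer_size <= 0 || buffer_size < nodelists_offset)
        return cur_size;

    /* Nothing to shrink. */
    if (buffer_size >= cur_size)
        return cur_size;

    return buffer_size;
}
```

With the values from this dump (a structure size of 204 and a
buffer_size of 0), the size would stay at 204 and the later GETBUF()
would not be asked for a zero-size buffer.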

Dave
