----- "ville mattila" <ville.mattila@xxxxxxxxxxxxx> wrote:

> > From:
> >
> > Dave Anderson <anderson@xxxxxxxxxx>
> >
> ...
> > But your kernel shows cache_cache.buffer_size set to zero -- and the
> > ASSIGN_SIZE(kmem_cache_s) above dutifully downsized the data structure
> > size from 204 to zero. Later on, that size was used to allocate a
> > kmem_cache buffer, which failed when a GETBUF() was called with a zero-size.
> >
> > I guess a check could be made above for a zero cache_cache.buffer_size,
> > but why would that ever be?
> >
> > Try this:
> >
> > # crash --no_kmem_cache vmlinux vmcore
> >
> > which will allow you to get past the kmem_cache initialization.
> >
> > Then enter:
> >
> > crash> p cache_cache
> >
> > Does the "buffer_size" member really show zero?
>
> Yes it seems so!
> initialize_task_state: using old defaults
> <readmem: 8067a300, KVADDR, "fill_task_struct", 868, (ROE), 86e3f78>
> addr: 8067a300 paddr: 67a300 cnt: 868
> STATE: TASK_RUNNING (PANIC)
>
> crash> p cache_cache
> cache_cache = GETBUF(128 -> 0)
> <readmem: 8067f1c0, KVADDR, "gdb_readmem_callback", 204, (ROE), 8ac00d8>
> addr: 8067f1c0 paddr: 67f1c0 cnt: 204
> $3 = {
>   array = {0x0, 0x8067f1c4, 0x8067f1c4, 0x0, 0x0, 0x0, 0x0, 0x0,
>     0xf7813e00, 0xf7849400, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0,
>     0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0},
>   batchcount = 0,
>   limit = 0,
>   shared = 0,
>   buffer_size = 0,
>   reciprocal_buffer_size = 0,
>   flags = 0,
>   num = 0,
>   gfporder = 0,
>   gfpflags = 60,
>   colour = 120,
>   colour_off = 8,
>   slabp_cache = 0x100,
>   slab_size = 16777216,
>   dflags = 0,
>   ctor = 0xf,
>   name = 0x0,
>   next = {
>     next = 0x0,
>     prev = 0x2
>   },
>   nodelists = {0x40}
> }
> FREEBUF(0)

That's some serious corruption!

> >
> > BTW, you can work around the problem by commenting out the call
> > to kmem_cache_downsize() in vm_init().
>
> This workaround works ok.

But even with the call to kmem_cache_downsize() commented out, the
kmem_cache_init() function could not have done anything useful, because
the "cache_cache.next.next" pointer, which should point to the first of
the chain of kmem_cache slab cache headers, is corrupted with a NULL.
I'm surprised it managed to continue without running into another
roadblock -- did it display the "crash: unable to initialize kmem slab
cache subsystem" error message?

> > (And if you're using makedumpfile with excluded pages, hope that
> > the problem I described above doesn't occur...)
> >
> We are not excluding files so this is not a big issue. Also
> the --no_kmem_cache lets me open dump and let me do quite many things
> already.

Like I mentioned before, I could put a check in kmem_cache_downsize()
for a zero buffer_size, but the odds of that happening are absurdly
small. I suppose I could check whether the value is less than the
kmem_cache.nodelists structure offset.

Dave

--
Crash-utility mailing list
Crash-utility@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/crash-utility
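
For illustration only, the sanity check discussed above (reject a
buffer_size that is zero or smaller than the kmem_cache.nodelists offset
before downsizing) might be sketched roughly as follows. This is not the
actual crash source: the function name and its parameters are
hypothetical, and the caller is assumed to have already read
cache_cache.buffer_size from the dump and looked up the structure size
and the nodelists offset from the debuginfo.

```c
#include <stddef.h>

/*
 * Hypothetical helper, not part of crash: pick the kmem_cache structure
 * size to use after reading cache_cache.buffer_size from the dump.
 *
 *   debug_size       - structure size according to the debuginfo
 *   buffer_size      - cache_cache.buffer_size as read from the dump
 *   nodelists_offset - offset of the nodelists member in kmem_cache
 */
size_t
sane_kmem_cache_size(size_t debug_size, size_t buffer_size,
		     size_t nodelists_offset)
{
	/*
	 * A zero value, or one that would cut the structure off before
	 * its nodelists member, cannot describe a real kmem_cache, so
	 * treat it as corruption and keep the debuginfo size.
	 */
	if (buffer_size == 0 || buffer_size < nodelists_offset)
		return debug_size;

	/* Only ever downsize; never grow past the debuginfo size. */
	if (buffer_size >= debug_size)
		return debug_size;

	return buffer_size;
}
```

With a corrupted buffer_size of zero, as in the dump above, such a check
would leave the structure size at its debuginfo value (204 here) rather
than shrinking it to zero, so the later GETBUF() would not be asked for
a zero-size buffer. It would not help with the NULL
cache_cache.next.next pointer, which is a separate problem.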