On Thu, 2005-10-27 at 13:17 -0400, Dave Anderson wrote:
> Badari Pulavarty wrote:
>
> > > That debug output certainly seems to pinpoint the issue at hand,
> > > doesn't it?  Very interesting...
> > >
> > > What's strange is that the usage of the cpu_pda[i].data_offset by the
> > > per_cpu() macro in "include/asm-x86_64/percpu.h" is unchanged.
> > >
> > > It's probably something very simple going on here, but I don't have
> > > any more ideas at this point.
> >
> > This is the reply I got from Andi Kleen..
> >
> > -------- Forwarded Message --------
> > From: Andi Kleen <ak@xxxxxxx>
> > To: Badari Pulavarty <pbadari@xxxxxxxxxx>
> > Subject: Re: cpu_pda->data_offset changed recently ?
> > Date: Thu, 27 Oct 2005 16:58:54 +0200
> >
> > On Thursday 27 October 2005 16:53, Badari Pulavarty wrote:
> > > Hi Andi,
> > >
> > > I am trying to fix "crash" utility to make it work on 2.6.14-rc5.
> > > (Its running fine on 2.6.10). It looks like crash utility reads
> > > and uses cpu_pda->data_offset values. It looks like there is a
> > > change between 2.6.10 & 2.6.14-rc5 which is causing "data_offset"
> > > to be huge values - which is causing "crash" to break.
> > >
> > > I added printk() to find out why ? As you can see from following
> > > what changed - Is this expected ? Please let me know.
> >
> > bootmem used to allocate from the end of the direct mapping on NUMA
> > systems. Now it starts at the beginning, often before the kernel .text.
> > This means it is negative. Perfectly legitimate. crash just has to
> > handle it.
> >
> > -Andi
> >
> > --
>
> That's what I thought it looked like, although the x8664_pda.data_offset
> field is an "unsigned long".  Anyway, if you take any of the per_cpu__xxx
> symbols from the 2.6.14 kernel, subtract a cpu data_offset, does it come up
> with a legitimate virtual address?

Unfortunately, I don't know x86-64 kernel virtual address space well
enough to answer your question.
My understanding is that x86-64 kernel addresses look something like:

addr: ffffffff80101000

But now (2.6.14-rc5) I do see addresses like:

pgdat: 0xffff81000000e000

which are causing read problems:

crash: read error: kernel virtual address: ffff81000000fa90  type: "pglist_data node_next"

I am not sure what these addresses are, or whether they are valid.
Is there a way to verify them, through gdb or /dev/kmem or something
like that?

Thanks,
Badari
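
A minimal sketch of how the addresses above could be sanity-checked offline. It assumes the 2.6.14-era x86-64 layout, in which the direct mapping of physical memory starts at 0xffff810000000000 and the kernel text mapping at 0xffffffff80000000. Those constants, the assumed end of the direct mapping, and all helper names are not from this thread; they should be checked against include/asm-x86_64/page.h and Documentation/x86_64/mm.txt for the kernel in question. The last helper illustrates why a "negative" data_offset held in an unsigned long can still yield a sane address: the addition simply wraps modulo 2^64.

```python
# Assumed x86-64 layout constants for 2.6.14-era kernels (NOT taken from this
# thread); verify against include/asm-x86_64/page.h and
# Documentation/x86_64/mm.txt for the kernel being debugged.
START_KERNEL_MAP = 0xFFFFFFFF80000000  # assumed base of the kernel text mapping
PAGE_OFFSET = 0xFFFF810000000000       # assumed base of the direct mapping
DIRECT_MAP_END = 0xFFFFC0FFFFFFFFFF    # assumed end of the direct mapping
MASK64 = (1 << 64) - 1


def classify(addr):
    """Return (region, physical address or None) for a kernel virtual address."""
    if addr >= START_KERNEL_MAP:
        return ("kernel text mapping", addr - START_KERNEL_MAP)
    if PAGE_OFFSET <= addr <= DIRECT_MAP_END:
        return ("direct mapping", addr - PAGE_OFFSET)
    return ("other", None)


def as_signed64(value):
    """Reinterpret an unsigned 64-bit value as a signed 64-bit value."""
    return value - (1 << 64) if value & (1 << 63) else value


def per_cpu_address(symbol, data_offset):
    """Add a per-cpu offset to a symbol address with 64-bit wraparound."""
    return (symbol + data_offset) & MASK64


if __name__ == "__main__":
    # Addresses quoted in the message above.
    for addr in (0xFFFFFFFF80101000, 0xFFFF81000000E000, 0xFFFF81000000FA90):
        region, phys = classify(addr)
        print(f"{addr:#018x}: {region}, phys {phys:#x}")

    # Hypothetical values, chosen only to show the wraparound: a per-cpu
    # area in the direct mapping and a per-cpu section start in kernel text.
    area = 0xFFFF810001234000
    per_cpu_start = 0xFFFFFFFF80600000
    data_offset = (area - per_cpu_start) & MASK64
    print("offset is negative when read as signed:", as_signed64(data_offset) < 0)
    print(f"{per_cpu_address(per_cpu_start + 0x100, data_offset):#018x}")
```

Under those assumptions, 0xffff81000000e000 would fall in the direct mapping (physical 0xe000) rather than the text mapping, which seems consistent with Andi's description of bootmem now allocating from the start of the direct mapping.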