Re: mem_cgroup_page_lruvec: BUG: unable to handle kernel NULL pointer dereference at 00000000000001a8

On Thu 13-06-13 16:57:23, Richard Weinberger wrote:
> On 13.06.2013 16:45, Richard Weinberger wrote:
> >On 13.06.2013 16:39, Michal Hocko wrote:
> >>On Thu 13-06-13 15:34:59, Richard Weinberger wrote:
> >>>On 13.06.2013 15:32, Michal Hocko wrote:
> >>>>Ohh and could you post the config please? Sorry should have asked
> >>>>earlier.
> >>>
> >>>See attachment.
> >>
> >>Nothing unusual there. Could you enable CONFIG_DEBUG_VM? Maybe it will
> >>help to catch the problem earlier.
> >
> >OK
> >
> >>>>On Thu 13-06-13 15:29:08, Michal Hocko wrote:
> >>>>>
> >>>>>On Thu 13-06-13 14:06:20, Richard Weinberger wrote:
> >>>>>[...]
> >>>>>>All code
> >>>>>>========
> >>>>>>    0:   89 50 08                mov    %edx,0x8(%rax)
> >>>>>>    3:   48 89 d1                mov    %rdx,%rcx
> >>>>>>    6:   0f 1f 40 00             nopl   0x0(%rax)
> >>>>>>    a:   49 8b 04 24             mov    (%r12),%rax
> >>>>>>    e:   48 89 c2                mov    %rax,%rdx
> >>>>>>   11:   48 c1 e8 38             shr    $0x38,%rax
> >>>>>>   15:   83 e0 03                and    $0x3,%eax
> >>>>>                    nid = page_to_nid
> >>>>>>   18:   48 c1 ea 3a             shr    $0x3a,%rdx
> >>>>>                    zid = page_zonenum
> >>
> >>Ohh, I am wrong here. rdx should be nid and eax the zid.
> >>
> >>>>>
> >>>>>>   1c:   48 69 c0 38 01 00 00    imul   $0x138,%rax,%rax
> >>>>>>   23:   48 03 84 d1 e0 02 00    add    0x2e0(%rcx,%rdx,8),%rax
> >>>>>                    &memcg->nodeinfo[nid]->zoneinfo[zid]
> >>>>>
> >>>>>>   2a:   00
> >>>>>>   2b:*  48 3b 58 70             cmp    0x70(%rax),%rbx     <-- trapping instruction
> >>>>>
> >>>>>OK, so this maps to:
> >>>>>         if (unlikely(lruvec->zone != zone)) <<<
> >>>>>                 lruvec->zone = zone;
> >>>>>
> >>>>>>[35355.883056] RSP: 0000:ffff88003d523aa8  EFLAGS: 00010002
> >>>>>>[35355.883056] RAX: 0000000000000138 RBX: ffff88003fffa600 RCX: ffff88003e04a800
> >>>>>>[35355.883056] RDX: 0000000000000020 RSI: 0000000000000000 RDI: 0000000000028500
> >>>>>>[35355.883056] RBP: ffff88003d523ab8 R08: 0000000000000000 R09: 0000000000000000
> >>>>>>[35355.883056] R10: 0000000000000000 R11: dead000000100100 R12: ffffea0000a14000
> >>>>>>[35355.883056] R13: ffff88003e04b138 R14: ffff88003d523bb8 R15: ffffea0000a14020
> >>>>>>[35355.883056] FS:  0000000000000000(0000) GS:ffff88003fd80000(0000)
> >>>>>
> >>>>>RAX (lruvec) is obviously incorrect and it doesn't make any sense. rax should
> >>>>>contain an address at an offset from ffff88003e04a800, but there is 0x138 there
> >>>>>instead.
> >>
> >>Hmm, now that I am looking at the registers again, RDX, which should be
> >>nid, seems to be quite big. It says this is node 32. Does the machine
> >>really have so many NUMA nodes?
> >
> >No. It's a KVM guest with two CPUs. Nothing special.
> >qemu command line:
> >qemu-kvm -m 1G -drive file=lxc_host.qcow2,if=virtio -nographic -kernel linux/arch/x86/boot/bzImage -append console=ttyS0 root=/dev/vda2 -net user,hostfwd=tcp::5555-:22 -net nic,model=e1000 -smp 4

OK, then something probably overwrites page->flags. I would be more
inclined to blame some other code ;)
Maybe DEBUG_VM will start shouting earlier.
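
For reference, the arithmetic behind that reading can be replayed in a
few lines of C. This is only a sketch: the shifts, the mask and the
offsets are read off the disassembly quoted in this thread, so they
apply to this particular build only. The helper names are made up for
illustration and are not kernel API. The flags value in the self-check
is hypothetical, chosen so that it decodes to nid 32 (what RDX shows)
and zid 1 (what RAX == 0x138 suggests if the nodeinfo slot read back as
NULL).

```c
#include <assert.h>
#include <stdint.h>

/* Constants taken from the quoted disassembly (build specific). */
#define ZONE_SHIFT      0x38   /* shr  $0x38,%rax            */
#define ZONE_MASK       0x3    /* and  $0x3,%eax             */
#define NODE_SHIFT      0x3a   /* shr  $0x3a,%rdx            */
#define ZONEINFO_SIZE   0x138  /* imul $0x138,%rax,%rax      */
#define NODEINFO_OFFSET 0x2e0  /* add  0x2e0(%rcx,%rdx,8)    */
#define LRUVEC_ZONE_OFF 0x70   /* cmp  0x70(%rax),%rbx       */

/* Hypothetical helpers: how this build extracts zid and nid from page->flags. */
static inline unsigned decode_zid(uint64_t flags)
{
	return (unsigned)((flags >> ZONE_SHIFT) & ZONE_MASK);
}

static inline unsigned decode_nid(uint64_t flags)
{
	return (unsigned)(flags >> NODE_SHIFT);
}

/* Address of the memcg->nodeinfo[nid] pointer slot (base comes from RCX). */
static inline uint64_t nodeinfo_slot(uint64_t memcg, unsigned nid)
{
	return memcg + NODEINFO_OFFSET + 8ull * nid;
}

/* Address that the trapping cmp reads, given the pointer loaded from the slot. */
static inline uint64_t lruvec_zone_addr(uint64_t nodeinfo_ptr, unsigned zid)
{
	return nodeinfo_ptr + (uint64_t)zid * ZONEINFO_SIZE + LRUVEC_ZONE_OFF;
}

/* Replays the numbers from the oops; the flags value is an assumption. */
static inline void check_oops_arithmetic(void)
{
	const uint64_t flags = 0x8100000000000000ull; /* hypothetical */

	assert(decode_nid(flags) == 32);              /* RDX == 0x20 */
	assert(decode_zid(flags) == 1);
	assert(nodeinfo_slot(0xffff88003e04a800ull, 32) == 0xffff88003e04abe0ull);
	assert(lruvec_zone_addr(0, 1) == 0x1a8);      /* faulting address */
}
```

If the slot for node 32 held NULL, the add leaves 1 * 0x138 in RAX and
the cmp at 0x70(%rax) touches 0x1a8, which matches both the RAX value
and the faulting address in the subject. That is consistent with a
corrupted page->flags, but it does not prove it.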
 
> Errr, I meant four CPUs. :)

-- 
Michal Hocko
SUSE Labs
