Re: kernel crash when using libnuma

On Mon, Feb 06, 2012 at 03:57:52AM -0500, Trevor Kramer wrote:
> I have a program which can allocate memory either with libnuma's
> numa_alloc_onnode() or with malloc(). In malloc mode everything
> works fine, but in libnuma mode I get consistent kernel panics
> with the following traces. This only occurs when multiple threads
> are running. Has anyone seen this before, or does anyone have
> recommendations on how to debug further?


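Looks like a THP (transparent huge pages) problem: the trace takes an
invalid-op trap in split_huge_page() on the munmap() path.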

For RHEL issues you normally need to talk to Red Hat; these lists
are more for mainline kernels.
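
One way to check whether THP is actually active on the affected machine is to read its sysfs switch. The sketch below is an illustration only: the mainline path is /sys/kernel/mm/transparent_hugepage/enabled, the RHEL 6 path under redhat_transparent_hugepage is an assumption about that vendor kernel, and the helper names are made up for this example.

```python
from pathlib import Path

# Candidate sysfs locations for the THP switch.
# The first is the mainline path; the second is an ASSUMPTION about
# where RHEL 6 kernels expose the same file.
THP_PATHS = (
    "/sys/kernel/mm/transparent_hugepage/enabled",
    "/sys/kernel/mm/redhat_transparent_hugepage/enabled",
)


def parse_thp_mode(text):
    """Return the bracketed token from e.g. '[always] madvise never'.

    The kernel marks the active setting with square brackets.
    Returns None if no bracketed token is present.
    """
    for token in text.split():
        if token.startswith("[") and token.endswith("]"):
            return token[1:-1]
    return None


def current_thp_mode(paths=THP_PATHS):
    """Return the active THP mode from the first readable path, else None."""
    for p in paths:
        try:
            return parse_thp_mode(Path(p).read_text())
        except OSError:
            # File missing or unreadable: try the next candidate.
            continue
    return None


if __name__ == "__main__":
    print("THP mode:", current_thp_mode() or "unknown")
```

If this reports "always", rerunning the test with the switch set to "never" would be one way to see whether the panic depends on THP. That is a diagnostic step, not a fix.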

-Andi

> 
> crash> bt
> PID: 62333  TASK: ffff883ff5698b40  CPU: 17  COMMAND: "test"
>  #0 [ffff883ff58378f0] machine_kexec at ffffffff810310cb
>  #1 [ffff883ff5837950] crash_kexec at ffffffff810b6392
>  #2 [ffff883ff5837a20] oops_end at ffffffff814de670
>  #3 [ffff883ff5837a50] die at ffffffff8100f2eb
>  #4 [ffff883ff5837a80] do_trap at ffffffff814ddf64
>  #5 [ffff883ff5837ae0] do_invalid_op at ffffffff8100ceb5
>  #6 [ffff883ff5837b80] invalid_op at ffffffff8100bf5b
>     [exception RIP: split_huge_page+2021]
>     RIP: ffffffff8116c605  RSP: ffff883ff5837c38  RFLAGS: 00010297
>     RAX: 0000000000000001  RBX: ffff880ff704bc38  RCX: 000000000000fe9e
>     RDX: 0000000000000000  RSI: 0000000000000046  RDI: 0000000000000246
>     RBP: ffff883ff5837d08   R8: 0000000000000000   R9: 0000000000000004
>     R10: 0000000000000001  R11: ffff880ff6fb7906  R12: ffff880ff84b7aa8
>     R13: fffffffffffffff2  R14: ffffea006c34c000  R15: ffffea006c34c000
>     ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
>  #7 [ffff883ff5837c30] split_huge_page at ffffffff8116c5aa
>  #8 [ffff883ff5837d10] __split_huge_page_pmd at ffffffff8116c6d1
>  #9 [ffff883ff5837d40] unmap_vmas at ffffffff8113559e
> #10 [ffff883ff5837e80] unmap_region at ffffffff8113cce1
> #11 [ffff883ff5837ef0] do_munmap at ffffffff8113d3a6
> #12 [ffff883ff5837f50] sys_munmap at ffffffff8113d4e6
> #13 [ffff883ff5837f80] system_call_fastpath at ffffffff8100b172
>     RIP: 00007f12d33154d2  RSP: 00007f12884731f8  RFLAGS: 00010283
>     RAX: 000000000000000b  RBX: ffffffff8100b172  RCX: 0000000000000020
>     RDX: 0000000000000000  RSI: 00000000003fe560  RDI: 00007f129f460000
>     RBP: 00000000003fe560   R8: 00007f1288475300   R9: 00007f1288475300
>     R10: 0000003d9c0eb3b0  R11: 0000000000000246  R12: 0000003d9c0f1fc0
>     R13: 0000003d9c0f0e00  R14: 00007f129f460000  R15: 00007f129f460000
>     ORIG_RAX: 000000000000000b  CS: 0033  SS: 002b
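
The user-space registers above can be read as the munmap() arguments, assuming the usual x86_64 syscall convention (RAX = 11 is munmap, RDI = address, RSI = length) and 2 MiB huge pages. The sketch below only does the alignment arithmetic on those two values; the function names are made up for illustration, and the registers alone cannot show whether a transparent huge page really backed the range.

```python
HPAGE_SIZE = 2 * 1024 * 1024  # 2 MiB huge page on x86_64 (assumed)
PAGE_SIZE = 4096              # base page size on x86_64


def page_align_up(n, page=PAGE_SIZE):
    """Round n up to the next multiple of the base page size."""
    return (n + page - 1) & ~(page - 1)


def partial_huge_pages(addr, length, hpage=HPAGE_SIZE):
    """Return (head_offset, tail_offset) for the range [addr, addr+length).

    Each value is how far the start or the page-aligned end of the range
    sits past a huge-page boundary. A nonzero value means the range cuts
    through a huge-page-sized region at that end.
    """
    end = addr + page_align_up(length)
    return addr % hpage, end % hpage


# Values copied from the trace: RDI and RSI at the system call.
addr = 0x00007F129F460000
length = 0x3FE560

head, tail = partial_huge_pages(addr, length)
print("start offset into huge page:", hex(head))
print("end offset into huge page:  ", hex(tail))
```

Both offsets come out nonzero, so the unmapped range does not start or end on a 2 MiB boundary. If a transparent huge page covered either edge, the kernel would have to split it during munmap(), which would be consistent with the __split_huge_page_pmd and split_huge_page frames in the trace.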
> 
> The machine is running Red Hat Enterprise Linux Server 6 with
> kernel 2.6.32-220.4.1.el6.x86_64.
> 
> Thanks,
> 
> Trevor
> --
> To unsubscribe from this list: send the line "unsubscribe linux-numa" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 

-- 
ak@xxxxxxxxxxxxxxx -- Speaking for myself only.