glusterfsd crash due to page allocation failure

Hello,

We recently upgraded from Gluster 3.6.6 to 3.7.6 and have started seeing page allocation failures in dmesg (stack trace appended below).

It appears that glusterfsd now sometimes fills up the cache completely and crashes with a page allocation failure. I *believe* it mainly happens when copying lots of new data to the system, running a 'find', or similar. All hosts run Scientific Linux 6.6, and the errors occur consistently on two separate Gluster pools.

Has anyone else seen this issue, and are there any known fixes for it, via sysctl kernel parameters or other means?

Please let me know of any other diagnostic information that would help.

Thanks,
Patrick


[1458118.134697] glusterfsd: page allocation failure. order:5, mode:0x20
[1458118.134701] Pid: 6010, comm: glusterfsd Not tainted 2.6.32-573.3.1.el6.x86_64 #1
[1458118.134702] Call Trace:
[1458118.134714]  [<ffffffff8113770c>] ? __alloc_pages_nodemask+0x7dc/0x950
[1458118.134728]  [<ffffffffa0321800>] ? mlx4_ib_post_send+0x680/0x1f90 [mlx4_ib]
[1458118.134733]  [<ffffffff81176e92>] ? kmem_getpages+0x62/0x170
[1458118.134735]  [<ffffffff81177aaa>] ? fallback_alloc+0x1ba/0x270
[1458118.134736]  [<ffffffff811774ff>] ? cache_grow+0x2cf/0x320
[1458118.134738]  [<ffffffff81177829>] ? ____cache_alloc_node+0x99/0x160
[1458118.134743]  [<ffffffff8145f732>] ? pskb_expand_head+0x62/0x280
[1458118.134744]  [<ffffffff81178479>] ? __kmalloc+0x199/0x230
[1458118.134746]  [<ffffffff8145f732>] ? pskb_expand_head+0x62/0x280
[1458118.134748]  [<ffffffff8146001a>] ? __pskb_pull_tail+0x2aa/0x360
[1458118.134751]  [<ffffffff8146f389>] ? harmonize_features+0x29/0x70
[1458118.134753]  [<ffffffff8146f9f4>] ? dev_hard_start_xmit+0x1c4/0x490
[1458118.134758]  [<ffffffff8148cf8a>] ? sch_direct_xmit+0x15a/0x1c0
[1458118.134759]  [<ffffffff8146ff68>] ? dev_queue_xmit+0x228/0x320
[1458118.134762]  [<ffffffff8147665d>] ? neigh_connected_output+0xbd/0x100
[1458118.134766]  [<ffffffff814abc67>] ? ip_finish_output+0x287/0x360
[1458118.134767]  [<ffffffff814abdf8>] ? ip_output+0xb8/0xc0
[1458118.134769]  [<ffffffff814ab04f>] ? __ip_local_out+0x9f/0xb0
[1458118.134770]  [<ffffffff814ab085>] ? ip_local_out+0x25/0x30
[1458118.134772]  [<ffffffff814ab580>] ? ip_queue_xmit+0x190/0x420
[1458118.134773]  [<ffffffff81137059>] ? __alloc_pages_nodemask+0x129/0x950
[1458118.134776]  [<ffffffff814c0c54>] ? tcp_transmit_skb+0x4b4/0x8b0
[1458118.134778]  [<ffffffff814c319a>] ? tcp_write_xmit+0x1da/0xa90
[1458118.134779]  [<ffffffff81178cbd>] ? __kmalloc_node+0x4d/0x60
[1458118.134780]  [<ffffffff814c3a80>] ? tcp_push_one+0x30/0x40
[1458118.134782]  [<ffffffff814b410c>] ? tcp_sendmsg+0x9cc/0xa20
[1458118.134786]  [<ffffffff8145836b>] ? sock_aio_write+0x19b/0x1c0
[1458118.134788]  [<ffffffff814581d0>] ? sock_aio_write+0x0/0x1c0
[1458118.134791]  [<ffffffff8119169b>] ? do_sync_readv_writev+0xfb/0x140
[1458118.134797]  [<ffffffff810a14b0>] ? autoremove_wake_function+0x0/0x40
[1458118.134801]  [<ffffffff8123e92f>] ? selinux_file_permission+0xbf/0x150
[1458118.134804]  [<ffffffff812316d6>] ? security_file_permission+0x16/0x20
[1458118.134806]  [<ffffffff81192746>] ? do_readv_writev+0xd6/0x1f0
[1458118.134807]  [<ffffffff811928a6>] ? vfs_writev+0x46/0x60
[1458118.134809]  [<ffffffff811929d1>] ? sys_writev+0x51/0xd0
[1458118.134812]  [<ffffffff810e88ae>] ? __audit_syscall_exit+0x25e/0x290
[1458118.134816]  [<ffffffff8100b0d2>] ? system_call_fastpath+0x16/0x1b
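
For anyone reading the trace, the first line is the informative one: the "order" is the base-2 logarithm of the number of contiguous pages requested. A minimal sketch of how to turn that header into a size follows. It assumes 4 KiB pages (the usual x86_64 default, not confirmed for these hosts), and the function name parse_alloc_failure is made up for illustration.

```python
import re

# Assumption: 4 KiB pages, the usual default on x86_64.
PAGE_SIZE = 4096

def parse_alloc_failure(line):
    """Pull comm, order, and mode out of a dmesg allocation-failure line.

    Returns None if the line is not an allocation-failure header.
    """
    m = re.search(
        r"(\S+): page allocation failure\. order:(\d+), mode:(0x[0-9a-fA-F]+)",
        line,
    )
    if m is None:
        return None
    order = int(m.group(2))
    return {
        "comm": m.group(1),
        "order": order,
        "mode": int(m.group(3), 16),
        # An order-N request asks for 2**N physically contiguous pages.
        "pages": 1 << order,
        "bytes": PAGE_SIZE << order,
    }

line = "[1458118.134697] glusterfsd: page allocation failure. order:5, mode:0x20"
info = parse_alloc_failure(line)
print(info["comm"], info["pages"], "pages,", info["bytes"] // 1024, "KiB contiguous")
```

Under that page-size assumption, the order:5 request in the trace is for 32 contiguous pages, i.e. 128 KiB in one physically contiguous block.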

_______________________________________________
Gluster-devel mailing list
Gluster-devel@xxxxxxxxxxx
http://www.gluster.org/mailman/listinfo/gluster-devel
