Andrey,

The patches seem to be against the infiniband drivers. Would I get any value from trying the elrepo 3.17.3 kernel to hopefully pick up the compaction changes?
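For reference, a rough sketch of how the fragmentation state could be checked before and after such a kernel change, using only the standard /proc/buddyinfo counters and the /proc/sys/vm/compact_memory trigger. The warning threshold below is an arbitrary guess, not a tuned value, and writing to compact_memory needs root on a kernel built with compaction support:

#!/usr/bin/env python
# Sketch: report how many free blocks of order >= 2 each zone has
# (order:2 is what the failing allocations in the logs are asking for)
# and, if the Normal zone looks exhausted, ask the kernel to compact.

MIN_ORDER = 2          # the failing allocations are order:2
WARN_THRESHOLD = 100   # arbitrary "looks low" cut-off per zone

def zones():
    with open("/proc/buddyinfo") as f:
        for line in f:
            parts = line.split()
            # Format: Node <n>, zone <name> <free block counts per order 0..N>
            node, zone = parts[1].rstrip(","), parts[3]
            counts = [int(c) for c in parts[4:]]
            yield node, zone, counts

def main():
    low = False
    for node, zone, counts in zones():
        high_order_free = sum(counts[MIN_ORDER:])
        print("node %s zone %-8s free blocks of order >= %d: %d"
              % (node, zone, MIN_ORDER, high_order_free))
        if zone == "Normal" and high_order_free < WARN_THRESHOLD:
            low = True
    if low:
        # Writing any value here asks the kernel to compact all zones.
        with open("/proc/sys/vm/compact_memory", "w") as f:
            f.write("1")
        print("requested memory compaction")

if __name__ == "__main__":
    main()

A plain cat /proc/buddyinfo shows the same per-order counts; the script just sums the order >= 2 columns so the numbers line up with the failing order:2 requests.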
Regards
Darryl

________________________________________
From: Andrey Korolyov <andrey@xxxxxxx>
Sent: Friday, 21 November 2014 8:27 AM
To: Bond, Darryl
Subject: Re: Kernel memory allocation oops Centos 7

Replying off-list: please check the ongoing thread here:

http://marc.info/?l=linux-mm&m=141643863320754&w=2

and, if possible, check the proposed patches against your env. Your case is a bit different because you are using bnx2x and not ipoib, which was suggested as a potential troublemaker.

Sorry, just clicked on the google invitation while scratching my eye, please ignore it (blaming Gmail interface).

On Fri, Nov 21, 2014 at 1:10 AM, Bond, Darryl <dbond@xxxxxxxxxxxxx> wrote:
> Brief outline:
>
> 6 node production cluster. Each node: Dell R610, 8x 1.4TB SAS disks, Samsung M.2 PCIe SSD for journals, 32GB RAM, Broadcom 10G interfaces.
>
> Ceph 0.80.7-0.el7.centos from the ceph repositories.
>
> About 10 times per day, each node will oops with the following message.
>
> An example:
>
> Nov 21 07:07:50 ceph14-04 kernel: warn_alloc_failed: 366 callbacks suppressed
> Nov 21 07:07:50 ceph14-04 kernel: swapper/4: page allocation failure: order:2, mode:0x104020
> Nov 21 07:07:50 ceph14-04 kernel: kswapd0: page allocation failure: order:2, mode:0x104020
> Nov 21 07:07:50 ceph14-04 kernel: CPU: 5 PID: 176 Comm: kswapd0 Not tainted 3.10.0-123.9.3.el7.x86_64 #1
> Nov 21 07:07:50 ceph14-04 kernel: Hardware name: Dell Inc. PowerEdge R620/01W23F, BIOS 2.2.2 01/16/2014
> Nov 21 07:07:50 ceph14-04 kernel: ceph-osd: page allocation failure: order:2, mode:0x104020
> Nov 21 07:07:50 ceph14-04 kernel: systemd-journal: page allocation failure: order:2, mode:0x104020
> Nov 21 07:07:50 ceph14-04 kernel: CPU: 9 PID: 704 Comm: systemd-journal Not tainted 3.10.0-123.9.3.el7.x86_64 #1
> Nov 21 07:07:50 ceph14-04 kernel: CPU: 4 PID: 0 Comm: swapper/4 Not tainted 3.10.0-123.9.3.el7.x86_64 #1
> Nov 21 07:07:50 ceph14-04 kernel: Hardware name: Dell Inc. PowerEdge R620/01W23F, BIOS 2.2.2 01/16/2014
> Nov 21 07:07:50 ceph14-04 kernel: 0000000000104020 000000005835f665 ffff88080f0a3a00 ffffffff815e239b
> Nov 21 07:07:50 ceph14-04 kernel: ceph-osd: page allocation failure: order:2, mode:0x104020
> Nov 21 07:07:50 ceph14-04 kernel: Hardware name: Dell Inc. PowerEdge R620/01W23F, BIOS 2.2.2 01/16/2014
> Nov 21 07:07:50 ceph14-04 kernel: 0000000000104020
> Nov 21 07:07:50 ceph14-04 kernel: CPU: 0 PID: 7453 Comm: ceph-osd Not tainted 3.10.0-123.9.3.el7.x86_64 #1
> Nov 21 07:07:50 ceph14-04 kernel: ffff88080f0a3a90
> Nov 21 07:07:50 ceph14-04 kernel: 0000000000104020
> Nov 21 07:07:50 ceph14-04 kernel: 000000009c9142fd
> Nov 21 07:07:50 ceph14-04 kernel: Hardware name: Dell Inc. PowerEdge R620/01W23F, BIOS 2.2.2 01/16/2014
>
> or another example:
>
> Nov 20 09:03:09 ceph14-06 kernel: warn_alloc_failed: 3803 callbacks suppressed
> Nov 20 09:03:09 ceph14-06 kernel: swapper/11: page allocation failure: order:2, mode:0x104020
> Nov 20 09:03:09 ceph14-06 kernel: CPU: 11 PID: 0 Comm: swapper/11 Not tainted 3.10.0-123.9.3.el7.x86_64 #1
> Nov 20 09:03:09 ceph14-06 kernel: Hardware name: Dell Inc. PowerEdge R620/01W23F, BIOS 2.2.2 01/16/2014
> Nov 20 09:03:09 ceph14-06 kernel: 0000000000104020 dbf4eb51672ffc35 ffff88080f163a00 ffffffff815e239b
> Nov 20 09:03:09 ceph14-06 kernel: ffff88080f163a90 ffffffff81147340 0000000000000002 ffff88080f163a50
> Nov 20 09:03:09 ceph14-06 kernel: ffff88082ffd7e80 ffff88082ffd7e80 0000000000000002 dbf4eb51672ffc35
> Nov 20 09:03:09 ceph14-06 kernel: Call Trace:
> Nov 20 09:03:09 ceph14-06 kernel: <IRQ> [<ffffffff815e239b>] dump_stack+0x19/0x1b
> Nov 20 09:03:09 ceph14-06 kernel: [<ffffffff81147340>] warn_alloc_failed+0x110/0x180
> Nov 20 09:03:09 ceph14-06 kernel: [<ffffffff8114b4dc>] __alloc_pages_nodemask+0x90c/0xb10
> Nov 20 09:03:09 ceph14-06 kernel: [<ffffffff8150941d>] ? ip_rcv_finish+0x7d/0x350
> Nov 20 09:03:09 ceph14-06 kernel: [<ffffffff81509ce4>] ? ip_rcv+0x234/0x380
> Nov 20 09:03:09 ceph14-06 kernel: [<ffffffff814d01c0>] ? netif_receive_skb+0x40/0xd0
> Nov 20 09:03:09 ceph14-06 kernel: [<ffffffff81188349>] alloc_pages_current+0xa9/0x170
> Nov 20 09:03:09 ceph14-06 kernel: [<ffffffff8114629e>] __get_free_pages+0xe/0x50
> Nov 20 09:03:09 ceph14-06 kernel: [<ffffffff811930ee>] kmalloc_order_trace+0x2e/0xa0
> Nov 20 09:03:09 ceph14-06 kernel: [<ffffffff814cfb32>] ? __netif_receive_skb_core+0x282/0x870
> Nov 20 09:03:09 ceph14-06 kernel: [<ffffffff81194749>] __kmalloc+0x219/0x230
> Nov 20 09:03:09 ceph14-06 kernel: [<ffffffffa0145bca>] bnx2x_frag_alloc.isra.65+0x2a/0x40 [bnx2x]
> Nov 20 09:03:09 ceph14-06 kernel: [<ffffffffa01461d4>] bnx2x_alloc_rx_data.isra.72+0x54/0x1c0 [bnx2x]
> Nov 20 09:03:09 ceph14-06 kernel: swapper/8: page allocation failure: order:2, mode:0x104020
>
> All oops seem to be triggered by page allocation failures.
>
> The effect of the oops is that the server has memory allocation errors all over the place, but mainly in the network stack. Not surprising, since that would be the major activity.
> I have set vm.swappiness to 0 on one node but it still generates the errors.
>
>              total       used       free     shared    buffers     cached
> Mem:      32732696   32507888     224808      51004          0   26187580
> -/+ buffers/cache:    6320308   26412388
> Swap:     31249404     308396   30941008
>
> Each oops is serious and affects the machine enough to trip nagios, which scans every 5 minutes. It would appear that the node doesn't respond to the network for many seconds.
>
> A couple of observations:
>
> It affects mon/osd servers as well as osd-only servers, although neither seems to be any more or less affected than the other.
>
> The OSD processes are affected on occasion but they do not seem to be using excessive memory:
>
> PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
> 13571 root 20 0 1847536 581636 4968 S 2.0 1.8 211:09.58 ceph-osd
> 13707 root 20 0 1803560 523904 4956 S 2.0 1.6 184:22.69 ceph-osd
> 13997 root 20 0 1905820 580768 5088 S 1.7 1.8 182:28.36 ceph-osd
> 13436 root 20 0 1783656 544400 5076 S 1.3 1.7 216:53.34 ceph-osd
> 13840 root 20 0 1778296 570400 4380 S 1.3 1.7 184:09.06 ceph-osd
> 14154 root 20 0 1881804 617748 5460 S 1.3 1.9 227:42.08 ceph-osd
> 14356 root 20 0 1906236 593936 4512 S 1.3 1.8 188:28.77 ceph-osd
> 14491 root 20 0 1837232 546140 4264 S 1.0 1.7 182:27.13 ceph-osd
>
> The main culprit seems to be the vm page cache.
>
> Any recommendations?
>
> Regards
> Darryl
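For anyone who lands on this thread with the same trace: the workaround that usually gets suggested for interrupt-context order-2 allocation failures like the bnx2x one above is to enlarge the kernel's free-page reserve so the atomic allocations have contiguous blocks to draw from. A minimal sketch, assuming only the standard /proc/sys/vm/min_free_kbytes interface, with the target value picked arbitrarily for a 32GB node rather than taken from this thread:

#!/usr/bin/env python
# Sketch of the commonly suggested mitigation: raise vm.min_free_kbytes so
# that atomic, interrupt-time order-2 allocations (the bnx2x receive path
# in the trace above) have a larger reserve of free, contiguous pages.
# The 512MB target is an assumption for a 32GB box, not a tested value.

def read_sysctl(name):
    with open("/proc/sys/" + name.replace(".", "/")) as f:
        return f.read().strip()

def write_sysctl(name, value):
    # Needs root.
    with open("/proc/sys/" + name.replace(".", "/"), "w") as f:
        f.write(str(value))

if __name__ == "__main__":
    current = int(read_sysctl("vm.min_free_kbytes"))
    target = 524288  # 512MB, roughly 1.6% of a 32GB node
    print("vm.min_free_kbytes: %d -> %d" % (current, max(current, target)))
    if target > current:
        write_sysctl("vm.min_free_kbytes", target)

The same change can be made with sysctl -w vm.min_free_kbytes=524288 and persisted in /etc/sysctl.conf. It does not fix the underlying fragmentation; it just leaves more headroom so the receive path fails less often while the compaction side of the problem is sorted out.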
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com