TR: CEPH nightmare or not

Pierre DOUCET <pierre.doucet@xxxxxx> · Tue, 15 Mar 2016 09:54:06 +0000

Hi,

We have 3 Ceph clusters (Hammer 0.94.5) on the same physical nodes, using LXC on Debian Wheezy. Each physical node has twelve 4 TB 7200 RPM hard drives, 2 x 200 GB MLC SSDs and 2 x 10 Gb Ethernet. For each physical drive we have an LXC container running 1 OSD, with its journal on an SSD partition.

One of our Ceph clusters has 96 OSDs with pgp_num set to 1024.
Last week we raised pgp_num from 1024 to 2048 in one pass. Bad idea :( You need to read the fucking manual before changing this kind of parameter.
Ceph was a bit stressed and couldn't return to normal. A few OSDs (~10%) were flapping.
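
For anyone wondering what "not in one pass" could look like, here is a minimal sketch, not what we actually ran. It only prints the `ceph osd pool set <pool> pgp_num <value>` commands for a stepped increase. The pool name "rbd" and the step of 128 are placeholders I picked for illustration, and the right step size depends on your cluster.

```python
def pgp_steps(current, target, step):
    """Return the intermediate pgp_num values from current up to target."""
    if step <= 0:
        raise ValueError("step must be positive")
    values = []
    value = current
    while value < target:
        # Never overshoot the target on the last step.
        value = min(value + step, target)
        values.append(value)
    return values


def pool_commands(pool, current, target, step):
    """Build one 'ceph osd pool set' command line per intermediate value."""
    return [
        f"ceph osd pool set {pool} pgp_num {value}"
        for value in pgp_steps(current, target, step)
    ]


if __name__ == "__main__":
    # "rbd" and the step of 128 are placeholders, not values from our cluster.
    for command in pool_commands("rbd", 1024, 2048, 128):
        print(command)
```

The idea would be to apply one command, wait until the cluster has settled (for example by watching `ceph health`), and only then apply the next one.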

On our physical nodes, we noticed some network problems:
Ping 127.0.0.1:
64 bytes from 127.0.0.1: icmp_req=1258 ttl=64 time=0.146 ms
ping: sendmsg: Invalid argument
64 bytes from 127.0.0.1: icmp_req=1260 ttl=64 time=0.023 ms
ping: sendmsg: Invalid argument
64 bytes from 127.0.0.1: icmp_req=1262 ttl=64 time=0.028 ms
ping: sendmsg: Invalid argument
ping: sendmsg: Invalid argument
ping: sendmsg: Invalid argument
64 bytes from 127.0.0.1: icmp_req=1266 ttl=64 time=0.026 ms
64 bytes from 127.0.0.1: icmp_req=1267 ttl=64 time=0.142 ms
ping: sendmsg: Invalid argument
ping: sendmsg: Invalid argument
64 bytes from 127.0.0.1: icmp_req=1270 ttl=64 time=0.137 ms
ping: sendmsg: Invalid argument

With our kernel (3.16) there was nothing in the logs. After a few days of research, we tried upgrading the kernel to a newer version (4.4.4). It was not so easy to backport it to Debian Wheezy, but after a few hours it worked. The problem hadn't gone away, but we noticed a new message in the logs:
arp_cache: Neighbour table overflow.

On Debian, the first ARP cache threshold (gc_thresh1) is only 128 entries by default!

We added this to sysctl.conf on every physical node:
net.ipv4.neigh.default.gc_thresh1 = 4096
net.ipv4.neigh.default.gc_thresh2 = 8192
net.ipv4.neigh.default.gc_thresh3 = 8192
net.ipv4.neigh.default.gc_stale_time = 86400
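
To see how close a node is to the limit, a rough check could look like the sketch below. It counts the entries listed in /proc/net/arp and compares them with gc_thresh3, which is the hard maximum for the neighbour table. The function names are made up for this example, and the count only covers what the current network namespace shows, so on a host full of containers treat it as a rough indicator.

```python
def count_arp_entries(proc_net_arp_text):
    """Count entries in /proc/net/arp content (the first line is a header)."""
    lines = [line for line in proc_net_arp_text.splitlines() if line.strip()]
    return max(len(lines) - 1, 0)


def overflow_risk(entries, gc_thresh3):
    """True when the entry count is at or above the hard limit gc_thresh3."""
    return entries >= gc_thresh3


def read_text(path):
    """Read a small text file such as a /proc entry."""
    with open(path) as handle:
        return handle.read()


def main():
    try:
        entries = count_arp_entries(read_text("/proc/net/arp"))
        thresh3 = int(read_text("/proc/sys/net/ipv4/neigh/default/gc_thresh3"))
    except (OSError, ValueError) as error:
        # Not on Linux, or /proc is not readable from here.
        print(f"cannot read neighbour table information: {error}")
        return
    print(f"{entries} ARP entries, gc_thresh3 = {thresh3}")
    if overflow_risk(entries, thresh3):
        print("neighbour table is at its hard limit")


if __name__ == "__main__":
    main()
```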

The network problems disappeared immediately and our cluster came back to a better state within a few hours: HEALTH_OK :)

To sum up:
Do not raise your pgp_num in one pass!
Check your kernel parameters; they may need some tuning for your setup.

Regards

Pierre DOUCET

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com