Fwd: Ceph nightmare or not


 



Hi,

 

We run three Ceph clusters (Hammer 0.94.5) on the same physical nodes, using LXC on Debian Wheezy. Each physical node has twelve 4 TB 7200 RPM hard drives, two 200 GB MLC SSDs, and two 10 Gb Ethernet ports. For each physical drive we run one LXC container hosting a single OSD, with its journal on an SSD partition.

 

One of our Ceph clusters has 96 OSDs and a PGP count (pgp_num) of 1024.

Last week we raised the PGP count from 1024 to 2048 in one pass. Bad idea :( You need to read the fucking manual before changing this kind of parameter.

Ceph was under stress and could not return to normal. About 10% of the OSDs were flapping.
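
As a rough illustration of what "not in one pass" could look like, here is a minimal sketch that splits the jump from 1024 to 2048 into smaller increments and prints the matching commands. The step size of 128 and the pool name "rbd" are assumptions made for the example, not values we tested; the script only prints the commands and does not run anything. In practice you would wait for the cluster to settle between steps.

```python
def pg_steps(current, target, step):
    """Return the values to apply, from just above `current` up to `target`."""
    if step <= 0:
        raise ValueError("step must be positive")
    if target < current:
        raise ValueError("target must not be below current")
    values = []
    value = current
    while value < target:
        # Never overshoot the target on the last increment.
        value = min(value + step, target)
        values.append(value)
    return values


def plan_commands(pool, current, target, step):
    """Build the list of ceph commands for a stepwise increase."""
    commands = []
    for value in pg_steps(current, target, step):
        # pg_num has to be raised before pgp_num can follow it.
        commands.append(f"ceph osd pool set {pool} pg_num {value}")
        commands.append(f"ceph osd pool set {pool} pgp_num {value}")
    return commands


if __name__ == "__main__":
    # "rbd" and the step of 128 are hypothetical example values.
    for line in plan_commands("rbd", 1024, 2048, 128):
        print(line)
```

A smaller step means less data moving at once but more rounds of waiting; the right value depends on the cluster.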

 

 

On our physical nodes, we noticed some network problems:

Ping 127.0.0.1:

64 bytes from 127.0.0.1: icmp_req=1258 ttl=64 time=0.146 ms
ping: sendmsg: Invalid argument
64 bytes from 127.0.0.1: icmp_req=1260 ttl=64 time=0.023 ms
ping: sendmsg: Invalid argument
64 bytes from 127.0.0.1: icmp_req=1262 ttl=64 time=0.028 ms
ping: sendmsg: Invalid argument
ping: sendmsg: Invalid argument
ping: sendmsg: Invalid argument
64 bytes from 127.0.0.1: icmp_req=1266 ttl=64 time=0.026 ms
64 bytes from 127.0.0.1: icmp_req=1267 ttl=64 time=0.142 ms
ping: sendmsg: Invalid argument
ping: sendmsg: Invalid argument
64 bytes from 127.0.0.1: icmp_req=1270 ttl=64 time=0.137 ms
ping: sendmsg: Invalid argument

 

 

With our kernel (3.16) there was nothing in the logs. After a few days of research, we tried upgrading to a newer kernel (4.4.4). It was not easy to backport to Debian Wheezy, but after a few hours it worked. The problem did not go away, but we noticed a new message in the logs:

arp_cache: Neighbour table overflow.

 

On Debian, the first ARP cache threshold (gc_thresh1) holds only 128 entries by default!

 

We added this to sysctl.conf on every physical node:

net.ipv4.neigh.default.gc_thresh1 = 4096
net.ipv4.neigh.default.gc_thresh2 = 8192
net.ipv4.neigh.default.gc_thresh3 = 8192
net.ipv4.neigh.default.gc_stale_time = 86400
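
To check whether a node is close to these limits, something like the following sketch may help. It counts the entries in /proc/net/arp and compares the count with the three gc_thresh values. The function names and the status labels are made up for this example, and /proc/net/arp only shows the IPv4 entries visible from the network namespace where the script runs, so treat the result as an indication rather than an exact measurement.

```python
from pathlib import Path

# Standard location of the IPv4 neighbour table settings on Linux.
NEIGH_DIR = Path("/proc/sys/net/ipv4/neigh/default")


def count_arp_entries(arp_text):
    """Count entries in /proc/net/arp-style text (the first line is a header)."""
    lines = [line for line in arp_text.splitlines() if line.strip()]
    return max(len(lines) - 1, 0)


def classify(entries, thresh1, thresh2, thresh3):
    """Describe where the entry count sits relative to the three thresholds."""
    if entries >= thresh3:
        return "hard limit reached"        # new entries can be refused
    if entries > thresh2:
        return "above soft limit"          # aggressive garbage collection
    if entries > thresh1:
        return "garbage collection active"
    return "below gc_thresh1"              # entries are left alone


def read_live_state():
    """Read the real values from /proc (Linux only)."""
    entries = count_arp_entries(Path("/proc/net/arp").read_text())
    thresholds = [
        int((NEIGH_DIR / name).read_text())
        for name in ("gc_thresh1", "gc_thresh2", "gc_thresh3")
    ]
    return entries, thresholds
```

On a node, `entries, t = read_live_state()` followed by `print(entries, classify(entries, *t))` gives a quick reading before and after changing sysctl.conf.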

 

 

The network problems disappeared immediately, and within a few hours our cluster was back in a healthy state: HEALTH_OK :)

 

 

To sum up:

Do not raise your PGP count in one pass!

Check your kernel parameters; you may need to tune some of them.

 

Regards

 

Pierre DOUCET

 

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
