On 05 March 2018 11:21, Jan Marquardt wrote:
Hi,

we are relatively new to Ceph and are observing some issues, where I'd like to know how likely they are to happen when operating a Ceph cluster.

Currently our setup consists of three servers which are acting as OSDs and MONs. Each server has two Intel Xeon L5420 (yes, I know, it's not state of the art, but we thought it would be sufficient for a proof of concept. Maybe we were wrong?) and 24 GB RAM and is running 8 OSDs with 4 TB hard disks. 4 OSDs are sharing one SSD for journaling. We started on Kraken and recently upgraded to Luminous. The next two OSD servers and three separate MONs are ready for deployment. Please find attached our ceph.conf.

Current usage looks like this:

  data:
    pools:   1 pools, 768 pgs
    objects: 5240k objects, 18357 GB
    usage:   59825 GB used, 29538 GB / 89364 GB avail

We have only one pool, which is used exclusively for rbd. We started filling it with data and creating snapshots in January until mid-February. Everything was working like a charm until we started removing old snapshots. While we were removing snapshots for the first time, OSDs started flapping, even though there was no other load on the cluster. For idle times we solved it by adding

  osd snap trim priority = 1
  osd snap trim sleep = 0.1

to ceph.conf. However, when there is load from other operations and we remove big snapshots, the OSD flapping still occurs.

Last week our first scrub errors appeared. Repairing the first one was no big deal. The second one, however, was, because the instructed OSD started crashing: first osd.17 on Friday and today osd.11.

ceph1:~# ceph pg repair 0.1b2
instructing pg 0.1b2 on osd.17 to repair

ceph1:~# ceph pg repair 0.1b2
instructing pg 0.1b2 on osd.11 to repair

I am still researching the crashes, but would already be thankful for any input. Any opinions, hints and advice would really be appreciated.
I had some similar issues when I started my proof of concept; I remember the snapshot deletion pains especially well.
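For what it's worth, the snap trim settings you put in ceph.conf can also be injected into the running OSDs without a restart, so you can experiment with the values while a deletion is in progress. A minimal sketch, assuming your admin node can reach all OSDs and using the same values you already chose:

ceph1:~# ceph tell osd.* injectargs '--osd_snap_trim_sleep 0.1 --osd_snap_trim_priority 1'

You can verify what a specific OSD is actually using with "ceph daemon osd.17 config get osd_snap_trim_sleep", run on the node hosting that OSD.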
The rule of thumb for filestore, which I assume you are running, is 1 GB of RAM per TB of OSD. So with 8 x 4 TB OSDs you are looking at roughly 32 GB of RAM for the OSDs alone, plus a few GB for the mon service, plus a few GB for the OS itself, which is already more than the 24 GB you have.
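A quick way to see how much the OSD daemons are actually consuming on a node is to sum their resident set sizes; a sketch, assuming the daemons show up under the usual ceph-osd process name:

ceph1:~# ps -C ceph-osd -o pid=,rss= | awk '{sum+=$2} END {printf "total ceph-osd RSS: %.1f GiB\n", sum/1048576}'

Run that on each OSD node, ideally while a snapshot deletion is running, and compare it against the 24 GB you have installed.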
I suspect that if you inspect your dmesg log and memory graphs you will find that the out-of-memory killer is ending your OSDs when the snap deletion (or any other high-load task) runs.
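Something along these lines should show whether the OOM killer has been at work (the exact wording of the kernel messages varies a bit between kernel versions):

ceph1:~# dmesg -T | egrep -i 'out of memory|oom-killer|killed process'

If ceph-osd shows up in those messages, the flapping is almost certainly memory pressure rather than a Ceph bug.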
I ended up reducing the number of OSDs per node, since the old mainboard I used was already maxed out on memory.
Corruptions occurred for me as well, and they were normally associated with disks dying or giving read errors. Ceph often managed to fix them, but sometimes I had to just remove the ailing OSD disk.
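In case you have not done it before, removing a failing OSD is roughly the usual out/stop/remove sequence; a sketch, with osd.17 just as a stand-in for whichever OSD turns out to be bad:

ceph1:~# ceph osd out 17
(wait until "ceph -s" shows the resulting recovery/backfill has finished)
ceph1:~# systemctl stop ceph-osd@17
(run the stop on the node that hosts osd.17)
ceph1:~# ceph osd crush remove osd.17
ceph1:~# ceph auth del osd.17
ceph1:~# ceph osd rm 17

On Luminous the last three commands can, as far as I know, be replaced by a single "ceph osd purge 17 --yes-i-really-mean-it".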
Have some graphs to look at. Personally I used munin/munin-node, since it was just an apt-get away from functioning graphs.
Also, I used smartmontools to send me emails about ailing disks, and smartctl to check all disks for errors.

Good luck with Ceph!

Kind regards
Ronny Aasen