Re: Ceph crash, how to analyse and recover

Hello,

On Wed, 25 May 2016 06:43:05 +0000 Ammerlaan, A.J.G. wrote:

> 
> Hello Ceph Users,
> 
> We have a Ceph test cluster that we want to bring into production and
> that will grow rapidly in the future. Ceph version:
> ceph           0.80.7-2+deb8u1   amd64   distributed storage and file system
> ceph-common    0.80.7-2+deb8u1   amd64   common utilities to mount and
> interact with a ceph storage cluster
>
I love native Debian packages and prefer to use them whenever possible;
my first Ceph cluster also started out with them.
At this point, however, they're severely outdated. There is a slightly
newer version (0.80.10, IIRC) in backports, but the whole Firefly release
is no longer receiving any backports or bug fixes.

Since you're testing, do yourself a favor and add the Ceph repository to
apt, then install Jewel (10.2.x).
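
On Jessie that would look roughly like this, run as root (the codename
lookup and package set are assumptions on my part, so double-check
against the official install docs):

  wget -q -O- 'https://download.ceph.com/keys/release.asc' | apt-key add -
  echo deb https://download.ceph.com/debian-jewel/ $(lsb_release -sc) main \
    > /etc/apt/sources.list.d/ceph.list
  apt-get update && apt-get install ceph ceph-common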
 
> 
> Our config:
> 5 hosts, each running 12 OSDs,
> containing 2 objects 
> One node went down and stayed down for about 12 hours
> Then it was brought back online (manually), the entire cluster slowly
> came to a halt with the current status being:
>
What does "manually" entail? 

There should be interesting bits in the logs (OSD and global Ceph),
because what you're seeing is not normal.
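
Assuming default log locations, I'd start with something like:

  # on the OSD nodes
  grep -iE 'error|fail|abort' /var/log/ceph/ceph-osd.*.log
  # on the monitors, the cluster-wide log
  less /var/log/ceph/ceph.log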

> First status after this crash:
> 
> cluster e2295d66-a265-11e5-8c92-00219bfd424c
>       health HEALTH_WARN 4628 pgs down; 4628 pgs peering; 4628 pgs stuck
> inactive; 4628 pgs stuck unclean
>       monmap e3: 3 mons at
> {a=172.30.0.2:6789/0,b=172.30.0.67:6789/0,mon=172.30.0.1:6789/0},
> election epoch 16, quorum 0,1,2 mon,a,b
>       osdmap e18880: 60 osds: 48 up, 48 in
>        pgmap v127495: 4628 pgs, 4 pools, 1238 bytes data, 4 objects
>              283 GB used, 130 TB / 130 TB avail
>                  4628 down+peering
>
Having 12 OSDs down and out is normal; having all your PGs down is not.
There's something fundamentally wrong with your cluster (network, custom
CRUSH rules, a Ceph bug in your old version, running out of resources like
open files), and it probably would have shown up even without the node
crash.
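
A couple of quick checks on the OSD nodes wouldn't hurt (paths and the
address below are just examples, adjust to your setup):

  # fd limit and current usage of one running OSD
  grep -i 'open files' /proc/$(pidof -s ceph-osd)/limits
  ls /proc/$(pidof -s ceph-osd)/fd | wc -l
  # basic connectivity between the nodes
  ping -c 3 172.30.0.2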
 
> The Ceph status at this moment:
> # ceph status
>     cluster e2295d66-a265-11e5-8c92-00219bfd424c
>      health HEALTH_WARN 4622 pgs down; 4628 pgs peering; 1427 pgs stale;
> 4628 pgs stuck inactive; 1427 pgs stuck stale; 4628 pgs stuck unclean;
> 2/17 in osds are down; 1 mons down, quorum 1,2 a,b
>      monmap e3: 3 mons at
> {a=172.30.0.2:6789/0,b=172.30.0.67:6789/0,mon=172.30.0.1:6789/0},
> election epoch 18, quorum 1,2 a,b
>      osdmap e19242: 60 osds: 15 up, 17 in
>       pgmap v128135: 4628 pgs, 4 pools, 118 bytes data, 3 objects
>             100 GB used, 47383 GB / 47483 GB avail
>                    3 peering
>                 1424 stale+down+peering
>                 3198 down+peering
>                    3 stale+peering
> 
That's even worse, having OSDs die off during recovery in a cluster that's
essentially empty.

Again, check the logs, but in your shoes I would:

a) totally wipe out the cluster, all OSDs, mons, config, clean out
everything.

b) install Jewel.

c) re-deploy your cluster, preferably with ceph-deploy (which I don't
particularly like, but it is more likely to create a usable cluster if
you're new to Ceph); see the sketch after this list.
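
Roughly like this with ceph-deploy (node names are placeholders, and
purge/purgedata really do destroy everything, which is fine here since
the cluster is essentially empty anyway):

  # a) wipe everything, run from the admin node for every host
  ceph-deploy purge node1 node2 node3 node4 node5
  ceph-deploy purgedata node1 node2 node3 node4 node5
  ceph-deploy forgetkeys
  # b) and c) redeploy with Jewel
  ceph-deploy new node1 node2 node3
  ceph-deploy install --release jewel node1 node2 node3 node4 node5
  ceph-deploy mon create-initial
  ceph-deploy osd create node1:sdb      # repeat per disk and host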


Then, once everything is running, health is OK, and "rados bench" or other
tests work well without destroying your cluster, turn off one node to
repeat what happened to you.
Hopefully (certainly in my experience) things will just recover from that.
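
For the bench part, something like this (60 seconds of writes, then
sequential reads; "rbd" is just a guess at your pool name):

  rados bench -p rbd 60 write --no-cleanup
  rados bench -p rbd 60 seq
  rados -p rbd cleanup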

Christian

> 
> 
> It is a test cluster, so no real harm done. How to get it back up, and
> why did this happen?
> 
> Regards, Arnoud.
> 
> ------------------------------------------------------------------------------
> 
> This message may contain confidential information and is intended
> exclusively for the addressee. If you receive this message
> unintentionally, please do not use the contents but notify the sender
> immediately by return e-mail. University Medical Center Utrecht is a
> legal person by public law and is registered at the Chamber of Commerce
> for Midden-Nederland under no. 30244197.
> 
> Please consider the environment before printing this e-mail.


-- 
Christian Balzer        Network/Systems Engineer                
chibi@xxxxxxx   	Global OnLine Japan/Rakuten Communications
http://www.gol.com/
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


