Hello Ceph Users,

We have a Ceph test cluster that we want to bring into production and that will grow rapidly in the future.

Ceph version:

ceph          0.80.7-2+deb8u1  amd64  distributed storage and file system
ceph-common   0.80.7-2+deb8u1  amd64  common utilities to mount and interact with a ceph storage cluster

Our config: 5 hosts, each running 12 OSDs, containing 2 objects.

One node went down and stayed down for about 12 hours. When it was brought back online (manually), the entire cluster slowly came to a halt.

First status after this crash:

    cluster e2295d66-a265-11e5-8c92-00219bfd424c
     health HEALTH_WARN 4628 pgs down; 4628 pgs peering; 4628 pgs stuck inactive; 4628 pgs stuck unclean
     monmap e3: 3 mons at {a=172.30.0.2:6789/0,b=172.30.0.67:6789/0,mon=172.30.0.1:6789/0}, election epoch 16, quorum 0,1,2 mon,a,b
     osdmap e18880: 60 osds: 48 up, 48 in
      pgmap v127495: 4628 pgs, 4 pools, 1238 bytes data, 4 objects
            283 GB used, 130 TB / 130 TB avail
                4628 down+peering

The Ceph status at this moment:

# ceph status
    cluster e2295d66-a265-11e5-8c92-00219bfd424c
     health HEALTH_WARN 4622 pgs down; 4628 pgs peering; 1427 pgs stale; 4628 pgs stuck inactive; 1427 pgs stuck stale; 4628 pgs stuck unclean; 2/17 in osds are down; 1 mons down, quorum 1,2 a,b
     monmap e3: 3 mons at {a=172.30.0.2:6789/0,b=172.30.0.67:6789/0,mon=172.30.0.1:6789/0}, election epoch 18, quorum 1,2 a,b
     osdmap e19242: 60 osds: 15 up, 17 in
      pgmap v128135: 4628 pgs, 4 pools, 118 bytes data, 3 objects
            100 GB used, 47383 GB / 47483 GB avail
                   3 peering
                1424 stale+down+peering
                3198 down+peering
                   3 stale+peering

It is a test cluster, so no real harm done.

How do we get it back up, and why did this happen?

Regards,
Arnoud

This message may contain confidential information and is intended exclusively for the addressee. If you receive this message unintentionally, please do not use the contents but notify the sender immediately by return e-mail. University Medical Center Utrecht is a legal person by public law and is registered at the Chamber of Commerce for Midden-Nederland under no. 30244197. Please consider the environment before printing this e-mail.
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com