Oh. That's strange; they are all mapped to two OSDs but are placed on
two different ones. I'm...not sure why that would happen. Are these
PGs active? What's the full output of "ceph -s"?
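(For what it's worth, a quick way to compare where CRUSH wants a PG versus where it's currently being served is something like the following — <pgid> is just a placeholder:
ceph pg map <pgid>       # prints the osdmap epoch plus the "up" and "acting" OSD sets
ceph pg <pgid> query     # full peering/recovery state for that PG
If "up" and "acting" differ, the PG is remapped and is still being served from the old acting set.)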
Those 4 PGs went inactive at some point, and we've had the luxury of time to understand how we arrived at this state before we truly have to fix it (but that time is coming soon).
So... we kicked a couple of OSDs out yesterday to let the cluster re-shuffle things (osd.19 and osd.34, both of which held non-primary copies in the 'acting' set), and now the cluster status is even more interesting, IMHO.
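(For the record, the "kick out" was just the usual mark-out, roughly:
ceph osd out 19
ceph osd out 34
ceph osd tree        # both now show REWEIGHT 0, i.e. out
which lines up with the "94 up, 92 in" below.)
Anyway, here's where we stand now: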
ceph@nc48-n1:/ceph-deploy/nautilus$ ceph -s
cluster 68bc69c1-1382-4c30-9bf8-480e32cc5b92
health HEALTH_WARN 2 pgs stuck inactive; 2 pgs stuck unclean; nodeep-scrub flag(s) set; crush map has legacy tunables
monmap e1: 3 mons at {nc48-n1=10.253.50.211:6789/0,nc48-n2=10.253.50.212:6789/0,nc48-n3=10.253.50.213:6789/0}, election epoch 564, quorum 0,1,2 nc48-n1,nc48-n2,nc48-n3
osdmap e80862: 94 osds: 94 up, 92 in
pgmap v1954234: 6144 pgs, 2 pools, 35251 GB data, 4419 kobjects
91727 GB used, 245 TB / 334 TB avail
ceph@nc48-n1:/ceph-deploy/nautilus$ ceph pg dump_stuck
pg_stat objects mip degr unf bytes log disklog state state_stamp v reported up up_primary acting acting_primary last_scrub scrub_stamp last_deep_scrub deep_scrub_stamp
11.e2f 280 0 0 0 2339844181 3001 3001 remapped 2015-04-23 13:18:59.299589 68310'51082 80862:121916 [77,4] 77 [77,34] 77 68310'51082 2015-04-23 11:40:11.565487 0'0 2014-10-20 13:41:46.122624
11.323 282 0 0 0 2357186647 3001 3001 remapped 2015-04-23 13:18:58.970396 70105'48961 80862:126346 [0,37] 0 [0,19] 0 70105'48961 2015-04-23 11:47:02.980145 8145'44375 2015-03-30 16:09:36.975875
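(We also still need to clear the nodeep-scrub flag and the legacy-tunables warning at some point; my understanding is that's roughly:
ceph osd unset nodeep-scrub
ceph osd crush show-tunables       # check which profile the map is actually on
ceph osd crush tunables firefly    # or 'optimal'; triggers a lot of data movement
but we're holding off on the tunables change until the two remapped PGs above are sorted out.)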