pg is stuck stale (osd.21 still removed)

Hi,

We had a hardware problem with osd.21 today. The OSD daemon was down and smartctl reported hardware errors on the disk.

I decided to remove the HDD:

          ceph osd out 21
          ceph osd crush remove osd.21
          ceph auth del osd.21
          ceph osd rm osd.21
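
Before running a removal like the one above, it can help to check whether any PG has the OSD as its only acting member. The following is a minimal sketch, not a tested procedure: `pgs_only_on_osd` is a hypothetical helper name, and it assumes that `ceph pg dump pgs_brief` prints one PG per line with the columns pg_stat, state, up, up_primary, acting, acting_primary (so the acting set is the fifth field). Please verify the column layout on your release before relying on it.

```shell
#!/bin/sh
# Sketch: list PGs whose acting set consists of exactly one given OSD.
# Reads pgs_brief-style text on stdin so it can be tried on saved output.
# pgs_only_on_osd is a hypothetical helper, not a ceph command.
pgs_only_on_osd() {
    # $1 = numeric OSD id, e.g. 21
    # ASSUMPTION: field 5 is the acting set, printed like [21] or [3,21].
    awk -v want="[$1]" '$5 == want { print $1 }'
}

# Usage against a live cluster (not run here):
#   ceph pg dump pgs_brief 2>/dev/null | pgs_only_on_osd 21
```

If this prints any PG IDs, those PGs have no other replica, and removing the OSD would leave them without a home.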

But afterwards I saw that I have some stuck PGs that were on osd.21:

	root@ceph-admin:~# ceph -w
	    cluster c7b12656-15a6-41b0-963f-4f47c62497dc
	     health HEALTH_WARN
	            50 pgs stale
	            50 pgs stuck stale
	     monmap e4: 3 mons at {ceph-mon1=192.168.135.31:6789/0,ceph-mon2=192.168.135.32:6789/0,ceph-mon3=192.168.135.33:6789/0}
	            election epoch 404, quorum 0,1,2 ceph-mon1,ceph-mon2,ceph-mon3
	     mdsmap e136: 1/1/1 up {0=ceph-mon1=up:active}
	     osdmap e18259: 23 osds: 23 up, 23 in
	      pgmap v47879105: 6656 pgs, 10 pools, 23481 GB data, 6072 kobjects
	            54974 GB used, 30596 GB / 85571 GB avail
	                6605 active+clean
	                  50 stale+active+clean
	                   1 active+clean+scrubbing+deep

	root@ceph-admin:~# ceph health
	HEALTH_WARN 50 pgs stale; 50 pgs stuck stale

	root@ceph-admin:~# ceph health detail
	HEALTH_WARN 50 pgs stale; 50 pgs stuck stale; noout flag(s) set
	pg 34.225 is stuck stale for 98780.399254, current state stale+active+clean, last acting [21]
	pg 34.186 is stuck stale for 98780.399195, current state stale+active+clean, last acting [21]
	...
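
To see which pools the stale PGs belong to, the `ceph health detail` lines can be summarised per pool. The following is only a minimal sketch: `stale_pgs_per_pool` is a made-up helper name, and it assumes the line format shown above (`pg <pool>.<id> is stuck stale ...`), where the part of the PG ID before the dot is the pool ID.

```shell
#!/bin/sh
# Sketch: count stuck-stale PGs per pool from `ceph health detail` text.
# Reads the text on stdin so it can be tried on saved output.
# stale_pgs_per_pool is a hypothetical helper, not a ceph command.
stale_pgs_per_pool() {
    awk '
        # Match lines such as:
        #   pg 34.225 is stuck stale for ..., last acting [21]
        $1 == "pg" && $3 == "is" && $4 == "stuck" && $5 == "stale" {
            split($2, id, ".")      # id[1] = pool id, id[2] = PG part
            count[id[1]]++
        }
        END {
            for (pool in count)
                printf "pool %s: %d stale pgs\n", pool, count[pool]
        }
    ' | sort
}

# Usage against a live cluster (not run here):
#   ceph health detail | stale_pgs_per_pool
```

A PG whose last acting set is just [21] had osd.21 as its only member, which would be consistent with a pool that keeps a single replica. The pool's replication settings should be visible in the `pool 34 ...` line of `ceph osd dump`.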

	root@ceph-admin:~# ceph pg 34.225   query
	Error ENOENT: i don't have pgid 34.225

	root@ceph-admin:~# ceph pg 34.225  list_missing
	Error ENOENT: i don't have pgid 34.225

	root@ceph-admin:~# ceph osd lost 21  --yes-i-really-mean-it
	osd.21 is not down or doesn't exist

	# checking the crushmap
	ceph osd getcrushmap -o crush.map
	crushtool -d crush.map -o crush.txt
	root@ceph-admin:~# grep 21 crush.txt
		-> nothing here....


Of course, I cannot start osd.21 again, because it no longer exists: I removed it.

Is there a way to remap the stuck PGs to OSDs other than osd.21? How can I help my cluster recover (Ceph 0.94.2)?

best regards
Danny


_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com