Hi,
After a disaster and the subsequent restart for automatic recovery, I found the following ceph status. Some OSDs cannot be restarted due to file system corruption (it seems that XFS is fragile).
[root@management-b ~]# ceph status
    cluster 3810e9eb-9ece-4804-8c56-b986e7bb5627
     health HEALTH_WARN
            209 pgs degraded
            209 pgs stuck degraded
            334 pgs stuck unclean
            209 pgs stuck undersized
            209 pgs undersized
            recovery 5354/77810 objects degraded (6.881%)
            recovery 1105/77810 objects misplaced (1.420%)
     monmap e1: 3 mons at {management-a=10.255.102.1:6789/0,management-b=10.255.102.2:6789/0,management-c=10.255.102.3:6789/0}
            election epoch 2308, quorum 0,1,2 management-a,management-b,management-c
     osdmap e25037: 96 osds: 49 up, 49 in; 125 remapped pgs
            flags sortbitwise
      pgmap v9024253: 2560 pgs, 5 pools, 291 GB data, 38905 objects
            678 GB used, 90444 GB / 91123 GB avail
            5354/77810 objects degraded (6.881%)
            1105/77810 objects misplaced (1.420%)
                2226 active+clean
                 209 active+undersized+degraded
                 125 active+remapped
  client io 0 B/s rd, 282 kB/s wr, 10 op/s
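To double-check my reading of this, I was planning to inspect the acting sets of the affected PGs before touching anything (the PG id 1.2f below is just a placeholder, not one of my actual PGs):

    ceph health detail                     # lists each stuck/degraded PG with its state
    ceph pg dump_stuck unclean             # shows up/acting OSD sets for stuck PGs
    ceph pg 1.2f query                     # inspect one PG's acting set and peering state in detail

My understanding is that an active+undersized+degraded PG is still serving I/O from its surviving replicas, and data is only at risk if a PG goes down or incomplete.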
Since the total number of active PGs equals the total number of PGs, and the number of degraded PGs equals the number of undersized PGs, does that mean all PGs still have at least one good replica? If so, can I simply mark the down OSDs as lost (or remove them), reformat their disks, and restart them, assuming there is no hardware problem with the HDDs? And which PG state should I pay more attention to regarding the possibility of lost objects, degraded or undersized? For reference, the per-OSD removal sequence I had in mind is sketched below.
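This is the sequence I am considering for each unrecoverable OSD, using osd.12 purely as a placeholder id (standard ceph CLI commands of this release):

    # only after confirming no PG depends solely on this OSD
    ceph osd lost 12 --yes-i-really-mean-it   # declare its data unrecoverable
    ceph osd out 12                           # no-op if the OSD is already out
    ceph osd crush remove osd.12              # drop it from the CRUSH map
    ceph auth del osd.12                      # remove its cephx key
    ceph osd rm 12                            # remove it from the osdmap

After reformatting the disk I would recreate the OSD and let backfill restore the replica count. Please correct me if any of these steps is dangerous in my situation.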
Best regards,