Hi,

I am using Ceph version 15.2.7 on a 4-node cluster. My OSDs keep stopping, and even when I start them again they stop after some time. I couldn't find anything useful in the logs. I have set the norecover and nobackfill flags; as soon as I unset norecover, the OSDs start to fail again.

  cluster:
    id:     b6437922-3edf-11eb-adc2-0cc47a5ec98a
    health: HEALTH_ERR
            1/6307061 objects unfound (0.000%)
            noout,nobackfill,norebalance,norecover,noscrub,nodeep-scrub flag(s) set
            19 osds down
            62477 scrub errors
            Reduced data availability: 75 pgs inactive, 12 pgs down, 57 pgs peering, 90 pgs stale
            Possible data damage: 1 pg recovery_unfound, 7 pgs inconsistent
            Degraded data redundancy: 3090660/12617416 objects degraded (24.495%), 394 pgs degraded, 399 pgs undersized
            5 pgs not deep-scrubbed in time
            127 daemons have recently crashed

  data:
    pools:   4 pools, 833 pgs
    objects: 6.31M objects, 23 TiB
    usage:   47 TiB used, 244 TiB / 291 TiB avail
    pgs:     9.004% pgs not active
             3090660/12617416 objects degraded (24.495%)
             315034/12617416 objects misplaced (2.497%)
             1/6307061 objects unfound (0.000%)
             368 active+undersized+degraded
             299 active+clean
             56  stale+peering
             24  stale+active+clean
             15  active+recovery_wait
             12  active+undersized+remapped
             11  active+undersized+degraded+remapped+backfill_wait
             11  down
             7   active+recovery_wait+degraded
             7   active+clean+remapped
             5   active+clean+remapped+inconsistent
             5   stale+activating+undersized
             4   active+recovering+degraded
             2   stale+active+recovery_wait+degraded
             1   active+recovery_unfound+undersized+degraded+remapped
             1   stale+remapped+peering
             1   stale+activating
             1   stale+down
             1   active+remapped+backfill_wait
             1   active+undersized+remapped+inconsistent
             1   active+undersized+degraded+remapped+inconsistent+backfill_wait

What needs to be done to recover this?
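For reference, these are roughly the commands involved. The exact invocations are from memory, and osd.N / <crash-id> are placeholders, so please treat this as approximate rather than a verbatim transcript:

    # pause recovery and backfill (the flags visible in the health output)
    ceph osd set norecover
    ceph osd set nobackfill

    # as soon as this is run, the OSDs begin going down again
    ceph osd unset norecover

    # restarting a stopped OSD; it dies again after a while
    systemctl restart ceph-osd@N    # package install; on a cephadm deployment the unit is ceph-<fsid>@osd.N

    # roughly where I have been looking for clues so far
    ceph health detail
    ceph crash ls
    ceph crash info <crash-id>
    journalctl -u ceph-osd@N --since "1 hour ago"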