Hi Loan,

thanks for posting the detailed post-mortem to the list! Unfortunately, I misread your first message.

On our cluster we also had issues with 1-2 PGs being stuck in peering, resulting in blocked IO and warnings piling up. We identified the "bad" OSD by shutting down one of the PG's member OSDs at a time and marking it out, so that it was in state down+out. As soon as the bad OSD was down+out, the PG recovered and became active. In our case the disks were bad and we replaced them.

I thought you had done the same, but after re-reading your message I see that you only restarted the OSDs, which will not force a remapping.

Sorry for the confusion. Hopefully our experience reports here will help other users.

Best regards,
=================
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14
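
For anyone who wants to script the procedure described above (take one member OSD of the stuck PG down+out at a time and check whether the PG becomes active), here is a minimal Python sketch. It is not what was actually run on our cluster. The function names, the timeouts and the `ceph-osd@<id>` systemd unit name are assumptions (the unit name fits package-based deployments; containerized setups use different unit names), and it assumes `systemctl` is executed on the host that carries the OSD. The list of member OSDs has to be supplied by hand, for example from the output of `ceph pg map <pgid>`.

```python
import json
import subprocess
import time


def run_cmd(cmd):
    """Run a command and return its stdout as text."""
    return subprocess.run(cmd, check=True, capture_output=True, text=True).stdout


def pg_state(pgid, run=run_cmd):
    """Return the PG state string, e.g. 'peering' or 'active+clean'."""
    return json.loads(run(["ceph", "pg", pgid, "query"]))["state"]


def find_bad_osd(pgid, member_osds, run=run_cmd, wait_s=60, poll_s=5):
    """Take one member OSD down+out at a time until the PG becomes active.

    Returns the id of the OSD whose removal let the PG go active (it is
    left down+out), or None if no single OSD made a difference.
    """
    for osd in member_osds:
        # Assumption: unit name 'ceph-osd@<id>' and local execution on the OSD host.
        run(["systemctl", "stop", f"ceph-osd@{osd}"])
        run(["ceph", "osd", "out", str(osd)])

        deadline = time.monotonic() + wait_s
        while True:
            # Match the exact 'active' flag, not e.g. 'activating'.
            if "active" in pg_state(pgid, run).split("+"):
                return osd
            if time.monotonic() >= deadline:
                break
            time.sleep(poll_s)

        # This OSD was not the culprit: bring it back before trying the next one.
        run(["systemctl", "start", f"ceph-osd@{osd}"])
        run(["ceph", "osd", "in", str(osd)])
    return None
```

A call would look like `find_bad_osd("1.2f", [3, 7, 9])`, where the PG id and OSD ids are made-up placeholders. Passing a different `run` function allows a dry run that only records the commands instead of executing them.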