Can you share your osd tree and the current ceph status?
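Ideally the output of the following (run from any node with an admin keyring; ceph health detail is optional but would show which PGs are down and which OSDs they map to):

  ceph osd tree
  ceph -s
  ceph health detail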
Quote from Kári Bertilsson <karibertils@xxxxxxxxx>:

Hello,

I had an incident where 3 OSDs crashed at once completely and won't power up, and during recovery 3 OSDs in another host have somehow become corrupted. I am running erasure coding with an 8+2 setup, using a crush map that takes 2 OSDs per host, and after losing the other 2 OSDs I have a few PGs down. Unfortunately these PGs seem to overlap almost all data on the pool, so I believe the entire pool is mostly lost after only these 2% of PGs going down.

I am running ceph 14.2.9.

OSD 92 log: https://pastebin.com/5aq8SyCW
OSD 97 log: https://pastebin.com/uJELZxwr

ceph-bluestore-tool repair without --deep showed "success", but the OSDs still fail with the logs above.

Log from trying ceph-bluestore-tool repair --deep, which is still running; I am not sure if it will actually fix anything, and the log looks pretty bad: https://pastebin.com/gkqTZpY3

Trying "ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-97 --op list" gave me an input/output error, but everything in SMART looks OK and I see no indication of hardware read errors in any logs. The same goes for both OSDs. The OSDs with corruption have absolutely no bad sectors and likely have only minor corruption, but at important locations.

Any ideas on how to recover from this kind of scenario? Any tips would be highly appreciated.

Best regards,
Kári Bertilsson
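If you do get one of the corrupted OSDs readable again, the usual last-resort route for down PGs is to export the surviving shards with ceph-objectstore-tool and import them into a healthy OSD. A rough sketch only (the OSD ids, pgid and file name below are placeholders, the OSDs have to be stopped first, and this of course only works once the input/output errors are gone):

  systemctl stop ceph-osd@97
  ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-97 --op list-pgs
  ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-97 --pgid <pgid> --op export --file /root/<pgid>.export
  ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-<healthy id> --op import --file /root/<pgid>.export

Note that for an EC pool the pgid includes the shard, e.g. 2.1fs0.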
_______________________________________________ ceph-users mailing list -- ceph-users@xxxxxxx To unsubscribe send an email to ceph-users-leave@xxxxxxx