Re: OSD corruption and down PGs

Can you share your osd tree and the current ceph status?


Quoting Kári Bertilsson <karibertils@xxxxxxxxx>:

Hello

I had an incident where 3 OSDs crashed completely at once and won't power
up. During recovery, 3 OSDs in another host somehow became corrupted. I am
running erasure coding with an 8+2 setup, using a CRUSH map which takes 2
OSDs per host, and after losing the other 2 OSDs I have a few PGs down.
Unfortunately these PGs seem to overlap almost all data on the pool, so I
believe the entire pool is mostly lost with only these 2% of PGs down.
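
To illustrate why a handful of lost OSDs can take PGs down in an 8+2 pool,
here is a rough Python sketch of the shard arithmetic only. The OSD IDs and
placements are made up for illustration and are not taken from this cluster.
Real PG availability also depends on the pool's min_size and on CRUSH, which
this sketch does not model.

```python
# Erasure-code profile described in the post: 8 data + 2 coding chunks.
K = 8
M = 2


def surviving_shards(pg_osds, lost_osds):
    """Count the shards of a PG that sit on OSDs which are still readable."""
    return sum(1 for osd in pg_osds if osd not in lost_osds)


def is_recoverable(pg_osds, lost_osds):
    """A PG's data can be reconstructed only if at least K shards survive."""
    return surviving_shards(pg_osds, lost_osds) >= K


# Hypothetical placements: K + M = 10 shards, 2 OSDs per host on 5 hosts.
pg_a = [0, 1, 10, 11, 20, 21, 30, 31, 40, 41]
pg_b = [2, 3, 12, 13, 22, 23, 32, 33, 42, 43]

# Hypothetical set of lost OSDs: three of them hold shards of pg_a.
lost = {0, 1, 10}

results = {
    "pg_a": is_recoverable(pg_a, lost),
    "pg_b": is_recoverable(pg_b, lost),
}
print(results)
```

In this made-up case pg_a has lost 3 shards, one more than M, so it cannot be
rebuilt from the remaining 7, while pg_b is untouched.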

I am running ceph 14.2.9.

OSD 92 log https://pastebin.com/5aq8SyCW
OSD 97 log https://pastebin.com/uJELZxwr

ceph-bluestore-tool repair without --deep reported "success", but the OSDs
still fail with the log above.

Log from trying ceph-bluestore-tool repair --deep, which is still running.
I am not sure it will actually fix anything, and the log looks pretty bad:
https://pastebin.com/gkqTZpY3

Trying "ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-97 --op
list" gave me an input/output error. But everything in SMART looks OK, and I
see no indication of hardware read errors in any logs. Same for both OSDs.

The OSDs with corruption have absolutely no bad sectors and likely have
only minor corruption, but at important locations.

Any ideas on how to recover from this kind of scenario? Any tips would be
highly appreciated.

Best regards,
Kári Bertilsson
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx

