Dear All,

I have a Mimic (13.2.0) cluster in which a bad disk controller corrupted three Bluestore OSDs on one node.

Unfortunately, these three OSDs crash when they try to start:

# systemctl start ceph-osd@193
(snip)
/BlueFS.cc: 828: FAILED assert(r != q->second->file_map.end())

Full log here: http://p.ip.fi/yFYn

"ceph-bluestore-tool repair" also crashes, with a similar error in BlueFS.cc:

# ceph-bluestore-tool repair --dev /dev/sdc2 --path /var/lib/ceph/osd/ceph-193
(snip)
/BlueFS.cc: 828: FAILED assert(r != q->second->file_map.end())

Full log here: http://p.ip.fi/l_Q_

This command works OK:

# ceph-bluestore-tool show-label --path /var/lib/ceph/osd/ceph-193
inferring bluefs devices from bluestore path
{
    "/var/lib/ceph/osd/ceph-193/block": {
        "osd_uuid": "90b25336-9932-4e0b-a16b-51159568c398",
        "size": 8001457295360,
        "btime": "2017-12-08 15:46:40.034495",
        "description": "main",
        "bluefs": "1",
        "ceph_fsid": "f035ee98-abfd-4496-b903-a403b29c828f",
        "kv_backend": "rocksdb",
        "magic": "ceph osd volume v026",
        "mkfs_done": "yes",
        "ready": "ready",
        "whoami": "193"
    }
}

# lsblk | grep sdc
sdc      8:32   0   7.3T  0 disk
├─sdc1   8:33   0   100M  0 part /var/lib/ceph/osd/ceph-193
└─sdc2   8:34   0   7.3T  0 part

Since the OSDs failed, the cluster has rebalanced, though I still have ceph HEALTH_ERR:

95 scrub errors; Possible data damage: 11 pgs inconsistent

Manual scrubs are not started by the OSD daemons (reported elsewhere, see '"ceph pg scrub" does not start').

Looking at the old logs, I see ~3500 entries in the logs of the bad OSDs, all similar to:

-9> 2018-07-04 14:42:34.744 7f9ef0bbb1c0 2 rocksdb: [/root/ceph-build/ceph-13.2.0/src/rocksdb/db/version_set.cc:1330] Unable to load table properties for file 43530 --- Corruption: bad block contents

There is a much smaller number of crc errors, similar to:

2> 2018-07-02 12:58:07.702 7fd3649eb1c0 -1 bluestore(/var/lib/ceph/osd/ceph-425) _verify_csum bad crc32c/0x1000 checksum at blob offset 0x0, got 0xff625379, expected 0x75b558bc, device
location [0xf5a66e0000~1000], logical extent 0x0~1000, object #-1:2c691ffb:::osdmap.176500:0#

I'm inclined to wipe these three OSDs and start again, but am happy to try suggestions to repair.

Thanks for any suggestions,

Jake
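
P.S. In case it helps anyone compare counts of the two kinds of log entries quoted above, here is a minimal Python sketch. It is only an illustration: the regular expressions are inferred from the two sample lines in this message, the function name tally_corruption is hypothetical, and it is not part of any Ceph tool.

```python
import re
from collections import Counter

# Patterns inferred from the two sample log lines quoted in this message.
# This is an assumption, not an exhaustive list of Ceph error formats.
ROCKSDB_CORRUPTION = re.compile(r"rocksdb: .*Corruption: bad block contents")
BLUESTORE_BAD_CRC = re.compile(r"bluestore\([^)]+\) _verify_csum bad crc32c")


def tally_corruption(lines):
    """Count RocksDB corruption and BlueStore checksum entries in log lines."""
    counts = Counter()
    for line in lines:
        if ROCKSDB_CORRUPTION.search(line):
            counts["rocksdb_corruption"] += 1
        elif BLUESTORE_BAD_CRC.search(line):
            counts["bluestore_bad_crc"] += 1
    return counts
```

To run it on a real log, one could pass an open file object, for example open(path, errors="replace"), so that any binary garbage in the corrupted entries does not raise a decoding error. The path would be wherever the OSD log lives on the node (commonly /var/log/ceph/, but that is an assumption about the local setup).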