Re: Multiple OSD crashing within short timeframe in production cluster running pacific

Den mån 13 sep. 2021 kl 16:18 skrev Kári Bertilsson <karibertils@xxxxxxxxx>:
>
> Hello everyone,
> errors. Yesterday the whole cluster was restarted, and two OSDs in one server
> would not come back up, while all the other OSDs were fine. Since most PGs
> had 10/10 available and a few had 9/10, I wasn't very worried, so I wiped the
> disks and started recovery. Both drives were crashing with the log below.
> bluestore(/var/lib/ceph/osd/ceph-244) _open_db_and_around read-only:0
> repair:0
> 2021-09-12T07:44:09.393+0000 7f6539879f00 -1
> bluestore(/var/lib/ceph/osd/ceph-244) _open_db erroring opening db:
> 2021-09-12T07:44:09.925+0000 7f6539879f00 -1 osd.244 0 OSD:init: unable to
> mount object store
> 2021-09-12T07:44:09.925+0000 7f6539879f00 -1  ** ERROR: osd init failed:
> (5) Input/output error


> bluestore(/var/lib/ceph/osd/ceph-229) _open_db_and_around read-only:0
> repair:0
> 2021-09-12T07:44:13.089+0000 7f22575b7f00 -1
> bluestore(/var/lib/ceph/osd/ceph-229) _open_db erroring opening db:
> 2021-09-12T07:44:13.613+0000 7f22575b7f00 -1 osd.229 0 OSD:init: unable to
> mount object store
> 2021-09-12T07:44:13.613+0000 7f22575b7f00 -1  ** ERROR: osd init failed:
> (5) Input/output error


Not sure, but could it be that the DB devices "jumped around"? That is,
OSD.244 is looking for (say) sda and OSD.229 for sdb, but the drive
letters moved after the reboot.

I think I have seen this when I had the DB on a separate partition on a
drive. The "find OSD by magic" step found the data partition on sdb1 and
should have used sdb2 for the DB, but if sdb used to be sda before the
reboot, the data would have been sda1 with its DB on sda2. When the
letters get flipped around, each OSD ends up seeing the wrong DB for
its data.

A small possibility, but perhaps worth checking if it helps you get the
data back (and worth writing down before any more reboots).
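If it helps, one way to write the current mapping down might be to record
where each OSD's block.db symlink resolves before rebooting again. The
sketch below uses a mocked directory layout under a temp dir (the real
one lives under /var/lib/ceph/osd/), so the paths are illustrative, not
taken from your cluster:

```shell
# Minimal sketch with a mocked OSD directory tree, not a live node.
osd_root="$(mktemp -d)"
mkdir -p "$osd_root/ceph-244" "$osd_root/ceph-229"

# Simulate block.db symlinks that point at unstable sdX partition names
# (the failure mode suspected above).
ln -s /dev/sda2 "$osd_root/ceph-244/block.db"
ln -s /dev/sdb2 "$osd_root/ceph-229/block.db"

# readlink shows the raw symlink target; if it is an sdX name, it can
# change between reboots. Stable names live under /dev/disk/by-partuuid/
# or /dev/disk/by-id/.
for db in "$osd_root"/ceph-*/block.db; do
    printf '%s -> %s\n' "$db" "$(readlink "$db")"
done
```

On a live node, comparing these targets against the stable
/dev/disk/by-partuuid/ or /dev/disk/by-id/ links (or against the output
of `ceph-volume lvm list`) should make it visible whether the letters
moved between reboots.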

-- 
May the most significant bit of your life be positive.
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx



