ceph osd crush swap-bucket causes catastrophic monitor failure

Today I made a mistake running a playbook and accidentally executed

 

ceph osd crush swap-bucket {old_host} {new_host}

 

where {old_host}={new_host}
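
In hindsight, a simple guard in the playbook task would have caught this.  A minimal sketch, assuming the hostnames arrive as shell variables (the variable names here are placeholders, not the real playbook variables):

# refuse to swap a bucket with itself
if [ "${old_host}" = "${new_host}" ]; then
    echo "old_host and new_host are identical, aborting" >&2
    exit 1
fi
ceph osd crush swap-bucket "${old_host}" "${new_host}"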

 

After that command, the first two monitors immediately stopped responding and crashed.  The third monitor's service was still running, but of course could not do anything without quorum, so I proceeded to edit its monmap down to a single-node configuration.  Upon restarting its service, it came up for less than a minute and then crashed the same way as the others.
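
For reference, the monmap edit followed the standard extract/remove/inject sequence, roughly as below (the mon names and temp path are placeholders):

systemctl stop ceph-mon@<surviving_mon>
# pull the current monmap out of the surviving monitor's store
ceph-mon -i <surviving_mon> --extract-monmap /tmp/monmap
# drop the two crashed monitors from the map
monmaptool /tmp/monmap --rm <dead_mon_1>
monmaptool /tmp/monmap --rm <dead_mon_2>
# inject the single-node map and restart
ceph-mon -i <surviving_mon> --inject-monmap /tmp/monmap
systemctl start ceph-mon@<surviving_mon>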

 

These errors were found in logs:

Mar  8 03:10:44 pistoremon-as-d01-tier1 ceph-mon[3654621]: /build/ceph-14.2.22/src/crush/CrushWrapper.cc: In function 'int CrushWrapper::swap_bucket(CephContext*, int, int)' thread 7f878de42700 time 2022-03-08 03:10:44.945920

Mar  8 03:10:44 pistoremon-as-d01-tier1 ceph-mon[3654621]: /build/ceph-14.2.22/src/crush/CrushWrapper.cc: 1279: FAILED ceph_assert(b->size == bs)

 

I have since attempted to follow these steps: 

 

https://docs.ceph.com/en/latest/rados/troubleshooting/troubleshooting-mon/#recovery-using-osds

 

in order to rebuild the kv_store and get a monitor working again. 
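
For context, the core of that procedure is roughly the single-host sketch below (the documented steps repeat this across every OSD host and rsync the store between them; the paths and keyring location are placeholders):

ms=/root/mon-store
mkdir -p "$ms"
# collect cluster maps from every stopped OSD on this host
for osd in /var/lib/ceph/osd/ceph-*; do
    ceph-objectstore-tool --data-path "$osd" \
        --op update-mon-db --mon-store-path "$ms"
done
# rebuild the monitor store from the collected maps
ceph-monstore-tool "$ms" rebuild -- --keyring /path/to/admin.keyring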

 

This has so far been unsuccessful, and I don’t seem to be getting very far.  I can sometimes get a single ceph-mon service to start, but only if there are two other monitors in the monmap, so that a quorum is never formed.

 

When a ceph-mon service won’t start, I usually cannot find any log errors beyond:

Mar  8 09:02:53 pistoremon-as-d02-tier1 systemd[1]: ceph-mon@pistoremon-as-d02-tier1.service: Start request repeated too quickly.

Mar  8 09:02:53 pistoremon-as-d02-tier1 systemd[1]: ceph-mon@pistoremon-as-d02-tier1.service: Failed with result 'start-limit-hit'.
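
As far as I can tell, 'start-limit-hit' only means systemd gave up retrying, so to capture a real error I assume the thing to do is clear the rate limit and run the monitor in the foreground with debug logging, something like (debug levels are arbitrary):

# let systemd allow the unit to start again
systemctl reset-failed ceph-mon@pistoremon-as-d02-tier1
# run the monitor in the foreground, logging to stderr, with verbose mon debugging
ceph-mon -d -i pistoremon-as-d02-tier1 --debug-mon 20 --debug-ms 1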

 

Here are some important details:

The cluster is all Nautilus 14.2.22

Most of the cluster OSDs are still filestore.  The point of the bucket swap was for bluestore migrations, of which about three have been completed.

The monitor hosts were still all on LevelDB.  However, the rebuild process above appears to have generated RocksDB output, and I had to manually change the kv_backend for that dump to do any good.
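
Concretely, that manual change was roughly the following (the mon ID and rebuilt-store path are placeholders):

# move the old LevelDB store aside and drop in the rebuilt RocksDB store
mv /var/lib/ceph/mon/ceph-<mon_id>/store.db /var/lib/ceph/mon/ceph-<mon_id>/store.db.bak
cp -r /root/mon-store/store.db /var/lib/ceph/mon/ceph-<mon_id>/store.db
# tell the monitor which backend the store now uses
echo rocksdb > /var/lib/ceph/mon/ceph-<mon_id>/kv_backend
chown -R ceph:ceph /var/lib/ceph/mon/ceph-<mon_id>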

 

We are in the middle of a major outage, so any expeditious assistance would be immensely appreciated.

 

Thanks,

Josh

 

_______________________________________________
Dev mailing list -- dev@xxxxxxx
To unsubscribe send an email to dev-leave@xxxxxxx
