Hi All,
I don't know how it happened (bad backup/restore, bad config file
somewhere, I don't know), but my (DEV) Ceph Cluster is in a very bad
state, and I'm looking for pointers/help in getting it back up and
running (unfortunately, a complete rebuild/restore is *not* an option).
This is on Ceph Reef (on Rocky 9) which was converted to cephadm from a
manual install a few weeks ago (and which worked). Five days ago
everything went "t!ts-up" (an Aussie technical ICT term meaning nothing
works :-) ).
So, my (first?) issue is that I can't get any Managers to come up clean.
Each one tries to connect to an IP subnet which no longer exists, and
hasn't for a couple of years.
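Since I do have mon quorum (more on that below), the cluster itself is
still queryable, so for reference this is the sort of thing I've been
checking to see where that old address might be coming from:

  # addresses the mgrs last registered in the mgrmap (mon-side, no active mgr needed)
  ceph mgr dump | grep -i addr
  # anything address-related pinned in the central config database?
  ceph config get mgr public_addr
  ceph config get mgr public_network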
The second issue is that every `ceph orch` command just hangs, while
cephadm commands themselves work fine - possibly because of the first
issue, since (as I understand it) `ceph orch` is serviced by the
orchestrator module inside the active mgr, so with no active mgr there's
nothing to answer it.
I've checked, checked, and checked again that the individual config
files all point to the correct IP subnet for the monitors, and I cannot
find any trace of the old subnet's addresses in any config file I can
locate.
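For reference, this is roughly how I've been searching (10.0.0. here is
just a placeholder for the old subnet, and the paths assume a standard
cephadm layout):

  # plain config files on each host
  grep -rF '10.0.0.' /etc/ceph/ /var/lib/ceph/ 2>/dev/null
  # central config db and cephadm's key/value store (mon quorum only)
  ceph config dump | grep -F '10.0.0.'
  ceph config-key dump | grep -F '10.0.0.'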
For the record, I am *not* a "podman guy", so there may be something
there that's causing my issue(s), but I don't know where to even begin
to look.
Any and all logs simply state that the Manager(s) try to come up, can't
find an address in the "old" subnet, and so fail - nothing else helpful
(at least to me).
I've even pulled a copy of the monmap, and it shows the correct IP
subnet addresses for the monitors.
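(For reference, that was roughly the following, run from inside
`cephadm shell` on a mon host since this is a containerised install:

  # grab the current monmap from the mons and print it
  ceph mon getmap -o /tmp/monmap
  monmaptool --print /tmp/monmap
)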
The firewalls are all set correctly, and audit2allow shows nothing out
of place; disabling SELinux makes no difference either.
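(That check was along the lines of:

  # look for recent AVC denials and what policy they'd suggest
  ausearch -m avc -ts recent | audit2allow
  # and, to rule SELinux out entirely
  setenforce 0
)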
A `ceph -s` shows I've got no active managers (and that a monitor is
down - that's my third issue), plus a whole bunch of OSDs and PGs aren't
happy either. I do, though, have monitor quorum.
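For what it's worth, this is how I've been poking at the failed mgr
daemons on each host (FSID is a placeholder for my real cluster fsid,
and the mgr daemon name is whatever `cephadm ls` reports):

  # list the daemons cephadm knows about on this host
  cephadm ls --no-detail
  # pull the mgr container's logs via cephadm
  cephadm logs --name mgr.HOSTNAME.xxxxxx
  # kick the systemd unit that wraps the container
  systemctl restart ceph-FSID@mgr.HOSTNAME.xxxxxx.service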
So, what should I be looking at / where should I be looking? Any help is
greatly *greatly* appreciated.
Cheers
Dulux-Oz