Fighting with cephadm; inconsistent maintenance mode, forever starting daemons

Hello,

Wall of text warning! :-)

I am playing with a new 5-node cluster install (actually 4 for now, since
the fifth server isn't up yet), reading and following the docs and
watching what happens. Strangely, multiple things went wrong that have
almost never happened to me with Ceph over the past many years, and I
feel cephadm is still not quite ready for prime time. I considered
opening issues, but maybe it's better to separate user errors from real
problems first, so I'm relying on your expertise and help.

I decided to use Pacific and podman 3.x. The first obstacle was the
mismatched documentation, but that's already tracked as
https://tracker.ceph.com/issues/55053 and I resolved it by manual
fiddling, so I ought to be on real Pacific now, fingers crossed.
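
For reference, something like this should confirm which release the
containers actually run:

ceph versions      # per-daemon version summary as reported by the cluster
cephadm version    # version of the local cephadm tool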

A small problem in the first stages of installing Ceph on Debian stable
(bullseye): the Ceph repo keys are probably in the old format, so they
aren't automagically recognised and need a manual apt-key add.
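
Roughly something like this (assuming the stock download.ceph.com key
and repo; adjust the release/suite as needed) gets past it:

wget -q -O- 'https://download.ceph.com/keys/release.asc' | apt-key add -
echo 'deb https://download.ceph.com/debian-pacific/ bullseye main' > /etc/apt/sources.list.d/ceph.list
apt update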

The requirements neglect to mention a few small things, e.g. that the
examples need curl, and I am not sure whether the missing firewalld
caused some of the pain later (at least I have seen complaints and
errors mentioning it, but maybe those can just be ignored).
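
For completeness, the packages I mean are roughly these on bullseye
(podman, lvm2 and a time sync daemon are documented requirements; curl
is the undocumented one):

apt install curl podman lvm2 chrony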

I think the docs don't specifically mention that hostnames (especially
when using bare hostnames) should resolve to addresses in the mon
network prefix, which is important if the host has a management subnet
separate from the mon and osd subnets.
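
As an illustration (addresses made up): if management is on 10.0.0.0/24
and the mon network is 192.168.10.0/24, the bare hostname should map to
the latter, e.g. in /etc/hosts:

192.168.10.21  olhado    # mon/public network address, not the 10.0.0.x management one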

Nevertheless, the default install of Prometheus and the dashboard seems
to be broken because it tries to use IPs instead of hostnames and fails
on the SSL certs not containing the IPs (as they shouldn't, really);
this needs a bit of googling and manual setting of URLs and such,
otherwise the dashboard is full of 500 errors without any explanation
(the cause only gets logged to syslog in a very ugly and pretty verbose
way).
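
The manual settings I mean are along these lines (hostnames and ports
are assumptions for a default cephadm monitoring stack, adjust to your
setup; I am not sure all of them are strictly needed):

ceph dashboard set-grafana-api-url https://alai.example.com:3000
ceph dashboard set-prometheus-api-host http://alai.example.com:9095
ceph dashboard set-alertmanager-api-host http://alai.example.com:9093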


My main problems, however, started when adding new hosts. Everything
was connectable [ssh keys], requirements were fulfilled (apart from the
missing firewalld mentioned above), except that some modifications
_suggested_ a reboot, which I skipped on two hosts (out of the 4
online).

The result was this: two working servers (the master and the one I had
rebooted for a different reason), and two non-working ones. Non-working
means a state where the master thinks they have mon/crash/vol/mgr
running, but nothing was actually running because podman died with a
mysterious error; for the record:

ceph: stderr Error: cannot open sd-bus: No such file or directory: OCI not found

As it turned out, it was caused by not rebooting. However, after a
reboot the daemons were started, except that they didn't join the
cluster. Also, ceph orch had little effect on their behaviour, while the
local cephadm seemed able to act on them, starting or stopping daemons,
though to little avail (so it isn't a connectivity issue).
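
For reference, the daemons' real state on such a host can be checked
with something like this (<fsid> is the cluster fsid):

cephadm ls                                        # what cephadm thinks is deployed locally
systemctl status ceph-<fsid>@mon.olhado.service   # the systemd unit cephadm created for the mon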

I tried to `ceph orch host (drain|rm)` them, which more or less
succeeded. It resulted in a "stopping" state in the master's view of the
daemons... which still blocks rm. I then removed the daemons manually,
with success. Then host rm. Worked.
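
The sequence was roughly this (take it as a sketch, names and fsid
abbreviated):

ceph orch host drain olhado
ceph orch ps olhado                                          # daemons stuck in "stopping"
cephadm rm-daemon --name mon.olhado --fsid <fsid> --force    # manual removal, run on the host itself
ceph orch host rm olhado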

Adding the hosts back (`ceph orch host add <name> <ip> _admin`) resulted
in all the daemons showing "starting" in the master's view, while
visibly running on the host and not joining the cluster. They stay
"starting" forever.

root@alai:~# ceph orch ps olhado --refresh
NAME                  HOST    PORTS   STATUS    REFRESHED  AGE  MEM USE  MEM LIM  VERSION    IMAGE ID   
crash.olhado          olhado          starting          -    -        -        -  <unknown>  <unknown>  
mon.olhado            olhado          starting          -    -        -    2048M  <unknown>  <unknown>  
node-exporter.olhado  olhado  *:9100  starting          -    -        -        -  <unknown>  <unknown>  


The next problem is finding the logs. It used to be so easy: go to
/var/log/ceph/ and find ceph-mon.0.log and the like. Not anymore. I
guess the mon logs to syslog under the generic name "conmon", probably
mixed up with the other daemons? (I have seen in the docs that I could
force it to create log files again; I didn't feel like yet another
reconfiguration.)
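
(The reconfiguration I mean is, as far as I understand the docs,
something like this; untested:)

ceph config set global log_to_file true
ceph config set global mon_cluster_log_to_file true
ceph config set global log_to_stderr false    # optionally stop the duplicate stream going to the journal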

Anyway, I see no error from the mon in the host's syslog; it said:
Mar 25 11:55:40 olhado conmon[33551]: debug 2022-03-25T10:55:40.083+0000 7fa24f6b0700  1 mon.olhado@-1(synchronizing) e2 sync_obtain_latest_monmap
Mar 25 11:55:40 olhado conmon[33551]: debug 2022-03-25T10:55:40.083+0000 7fa24f6b0700  1 mon.olhado@-1(synchronizing) e2 sync_obtain_latest_monmap obtained monmap e2

and the master says, every second:
Mar 25 14:19:24 alai conmon[822444]: debug 2022-03-25T13:19:24.729+0000 7f09c18df700  1 mon.alai@0(leader) e2  adding peer [v2:***:3300/0,v1:***:6789/0] to list of hints

but it's not in `ceph status` or `ceph mon dump` on the master (and a
local `ceph` just waits forever; it's really hard to tell why, since it
seems to be communicating with the master mon).

In due course I tried putting them into maintenance mode, since I was
trying to get rid of the "starting" state, first using ceph orch and
then the local cephadm. The two do not seem to communicate well, which
is aptly demonstrated here:

root@alai:~# ceph health detail
HEALTH_WARN 1 host is in maintenance mode
[WRN] HOST_IN_MAINTENANCE: 1 host is in maintenance mode
    olhado is in maintenance

root@alai:~# ceph orch host maintenance exit olhado
Error EINVAL: Host olhado is not in maintenance mode

I guess I need to find a combination where cephadm and ceph orch see
the same state, probably enabling maintenance via cephadm and then
trying to remove it using ceph orch.
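
Perhaps something like this would bring the two views back in sync (an
untested guess):

ceph orch host ls                                  # orchestrator's view; STATUS should say "maintenance"
ceph orch host maintenance enter olhado --force    # make the orchestrator agree the host is in maintenance
ceph orch host maintenance exit olhado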

I also hit a state where `ceph orch *` simply waited forever, and it
was only resolvable by stopping ceph-volume on the host using podman
stop. I was not able to find any log or debug output that would have
explained what was happening and why.
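
Maybe bouncing the active mgr would also have unstuck it, something
like:

ceph mgr fail    # fail over to a standby mgr, which restarts the orchestrator module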

Generally my problem is that I don't (yet?) see a simple way to see
what is happening:
- when ceph should deploy something automagically but doesn't
- when cephadm/orch says something is "starting" or "stopping" and it
  never changes
- where the daemon logs are and how to follow them easily (right now
  I'm using `podman logs ...`, but I'm not sure that's the proper way;
  see the sketch after this list)
- whether `ceph orch host rm` _really_ removes the host so it can be
  added back later, or whether it needs manual deletion of something.
  It seems it does, since the aforementioned maintenance mode seems to
  have stayed throughout the removal.
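
For the log question, the closest thing to the old behaviour seems to
be the systemd journal units cephadm creates; something like this should
work (<fsid> is the cluster fsid):

cephadm logs --name mon.olhado                     # wraps journalctl for the daemon's unit (add --fsid if needed)
journalctl -u ceph-<fsid>@mon.olhado.service -f    # follow the same unit directly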

I will scrap the whole thing soon; there is no harm done, apart from
the time spent watching (or rather guessing) what cephadm does.

I wonder whether some of these are bugs to be fixed (either in cephadm
or in the documentation), or whether they are all preventable user
errors.

Sorry for the wall of text.

Thanks,
Peter

_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx


