Hi,
We have a cephadm-based Octopus cluster (upgraded to 15.2.17 today, but
the problem started with 15.2.16) where we are trying to deploy an RGW
in a multisite configuration. We followed the documentation at
https://docs.ceph.com/en/octopus/radosgw/multisite/ to do the basic
realm, zonegroup, zone, and pool configuration.
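For reference, the setup followed the sequence from that page, roughly
as below (the zonegroup name and endpoint are abbreviated here from
memory; the realm is called eros and the zone carries the same name, as
can be seen in the daemon name in the log further down):
-----
# realm / zonegroup / zone, then commit the period
radosgw-admin realm create --rgw-realm=eros --default
radosgw-admin zonegroup create --rgw-zonegroup=eros-zg --master --default \
    --endpoints=http://valvd-rgw1:80
radosgw-admin zone create --rgw-zonegroup=eros-zg --rgw-zone=eros \
    --master --default --endpoints=http://valvd-rgw1:80
radosgw-admin period update --commit
-----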
with "ceph orch apply rgw...". The Ceph image is loaded (the first
The Ceph image is loaded (the first time), the container starts, and it
immediately exits with status code 5/NOTINSTALLED. I was unable to find
any error message in the logs that could point to the cause of the
problem. I reinstalled the machine hosting the RGW (running yesterday's
build of CentOS Stream 8, but the problem started with a build from
early August), removed the pools including .rgw.root, recreated
everything, and the problem remains the same.
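For completeness, the pools were removed the usual way, i.e. something
like this for each of them (with mon_allow_pool_delete temporarily
enabled):
-----
# pool deletion must be explicitly allowed first
ceph config set mon mon_allow_pool_delete true
ceph osd pool rm .rgw.root .rgw.root --yes-i-really-really-mean-it
-----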
A start sequence in /var/log/messages is:
-----
Sep 28 18:55:49 valvd-rgw1 systemd[1]: Starting Ceph
rgw.eros.eros.valvd-rgw1.eaafgz for cce5ffb0-9124-40e5-a55c-3e5cc8660d47...
Sep 28 18:55:50 valvd-rgw1 systemd[1]:
var-lib-containers-storage-overlay.mount: Succeeded.
Sep 28 18:55:50 valvd-rgw1 systemd[1]:
var-lib-containers-storage-overlay.mount: Succeeded.
Sep 28 18:55:50 valvd-rgw1 systemd[1]: Started libcontainer container
86707b0c2658f21a09f229a2e049c922dd697475768d6fa3a31bfc223a1eda48.
Sep 28 18:55:50 valvd-rgw1 bash[54606]:
86707b0c2658f21a09f229a2e049c922dd697475768d6fa3a31bfc223a1eda48
Sep 28 18:55:50 valvd-rgw1 systemd[1]: Started Ceph
rgw.eros.eros.valvd-rgw1.eaafgz for cce5ffb0-9124-40e5-a55c-3e5cc8660d47.
Sep 28 18:55:51 valvd-rgw1 systemd[1]:
libpod-86707b0c2658f21a09f229a2e049c922dd697475768d6fa3a31bfc223a1eda48.scope:
Succeeded.
Sep 28 18:55:51 valvd-rgw1 systemd[1]:
libpod-86707b0c2658f21a09f229a2e049c922dd697475768d6fa3a31bfc223a1eda48.scope:
Consumed 297ms CPU time
Sep 28 18:55:51 valvd-rgw1 systemd[1]:
var-lib-containers-storage-overlay-d323c9431c7f488696f7e467397a94192a022b2b13155fb6a34f80236330dff3-merged.mount:
Succeeded.
Sep 28 18:55:51 valvd-rgw1 systemd[1]:
var-lib-containers-storage-overlay.mount: Succeeded.
Sep 28 18:55:51 valvd-rgw1 systemd[1]:
ceph-cce5ffb0-9124-40e5-a55c-3e5cc8660d47@rgw.eros.eros.valvd-rgw1.eaafgz.service:
Main process exited, code=exited, status=5/NOTINSTALLED
Sep 28 18:55:51 valvd-rgw1 systemd[1]:
ceph-cce5ffb0-9124-40e5-a55c-3e5cc8660d47@rgw.eros.eros.valvd-rgw1.eaafgz.service:
Failed with result 'exit-code'.
-----
I googled and found a couple of similar issues reported to this list,
in particular:
- https://www.mail-archive.com/ceph-users@ceph.io/msg09680.html, but
that one is with Pacific and not a cephadm-based cluster, so it may be
something different, and the workaround doesn't apply as the config
option doesn't exist in Octopus;
- https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/thread/UDUDIHZN5NKTUFQQV7OK2FYYNFVTL2XS/,
which is with Octopus but was related to a configuration with multiple
realms, whereas I have only one, defined as the default (called eros);
see the check just below.
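For the record, I double-checked the single-realm point with something
like:
-----
# list all realms and show which one is the default
radosgw-admin realm list
radosgw-admin realm get-default
-----
and only eros shows up, marked as the default.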
Both threads mention that they found the error "Couldn't init storage
provider (RADOS)", something I have not seen. I may have missed it, as
I don't know exactly in which log file I should look for it.
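In case I am looking in the wrong place, what I have checked so far is
/var/log/messages (quoted above) and the daemon's journal, along these
lines (names copied from the log above):
-----
# journal of the RGW systemd unit on the host
journalctl -u ceph-cce5ffb0-9124-40e5-a55c-3e5cc8660d47@rgw.eros.eros.valvd-rgw1.eaafgz.service
# same information via cephadm
cephadm logs --name rgw.eros.eros.valvd-rgw1.eaafgz
-----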
I have certainly made some trivial mistake, but I have been stuck on
this problem for quite some time without any clue about where the issue
is. Thanks in advance for your help.
Cheers,
Michel