15.2.17: RGW deploy through cephadm exits immediately with exit code 5/NOTINSTALLED

Hi,

We have a cephadm-based Octopus cluster (upgraded to 15.2.17 today, but the problem started with 15.2.16) where we are trying to deploy an RGW in a multisite configuration. We followed the documentation at https://docs.ceph.com/en/octopus/radosgw/multisite/ for the basic realm, zonegroup, zone and pool configuration, then deployed an RGW with "ceph orch apply rgw ..." (the commands we used are summarized, with placeholders, below the log excerpt). The Ceph image is pulled (the first time), the container starts and immediately exits with status code 5/NOTINSTALLED. I was unable to find any error message in the logs that could explain the problem. I reinstalled the machine hosting the RGW (running yesterday's build of CentOS Stream 8, but the problem started with a build from early August), removed the pools including .rgw.root, recreated everything, and the problem remains the same. A start sequence from /var/log/messages is:

-----

Sep 28 18:55:49 valvd-rgw1 systemd[1]: Starting Ceph rgw.eros.eros.valvd-rgw1.eaafgz for cce5ffb0-9124-40e5-a55c-3e5cc8660d47...
Sep 28 18:55:50 valvd-rgw1 systemd[1]: var-lib-containers-storage-overlay.mount: Succeeded.
Sep 28 18:55:50 valvd-rgw1 systemd[1]: var-lib-containers-storage-overlay.mount: Succeeded.
Sep 28 18:55:50 valvd-rgw1 systemd[1]: Started libcontainer container 86707b0c2658f21a09f229a2e049c922dd697475768d6fa3a31bfc223a1eda48.
Sep 28 18:55:50 valvd-rgw1 bash[54606]: 86707b0c2658f21a09f229a2e049c922dd697475768d6fa3a31bfc223a1eda48
Sep 28 18:55:50 valvd-rgw1 systemd[1]: Started Ceph rgw.eros.eros.valvd-rgw1.eaafgz for cce5ffb0-9124-40e5-a55c-3e5cc8660d47.
Sep 28 18:55:51 valvd-rgw1 systemd[1]: libpod-86707b0c2658f21a09f229a2e049c922dd697475768d6fa3a31bfc223a1eda48.scope: Succeeded.
Sep 28 18:55:51 valvd-rgw1 systemd[1]: libpod-86707b0c2658f21a09f229a2e049c922dd697475768d6fa3a31bfc223a1eda48.scope: Consumed 297ms CPU time
Sep 28 18:55:51 valvd-rgw1 systemd[1]: var-lib-containers-storage-overlay-d323c9431c7f488696f7e467397a94192a022b2b13155fb6a34f80236330dff3-merged.mount: Succeeded.
Sep 28 18:55:51 valvd-rgw1 systemd[1]: var-lib-containers-storage-overlay.mount: Succeeded.
Sep 28 18:55:51 valvd-rgw1 systemd[1]: ceph-cce5ffb0-9124-40e5-a55c-3e5cc8660d47@xxxxxxxxxxxxx.valvd-rgw1.eaafgz.service: Main process exited, code=exited, status=5/NOTINSTALLED
Sep 28 18:55:51 valvd-rgw1 systemd[1]: ceph-cce5ffb0-9124-40e5-a55c-3e5cc8660d47@xxxxxxxxxxxxx.valvd-rgw1.eaafgz.service: Failed with result 'exit-code'.
----
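
The realm/zonegroup/zone setup itself was done along the lines of the documented master zone procedure; here is a condensed version with placeholders instead of our real zonegroup name, endpoint, port and keys (the realm and zone are both called eros, as the daemon name above shows):

-----

# following https://docs.ceph.com/en/octopus/radosgw/multisite/ (values in <> are placeholders)
radosgw-admin realm create --rgw-realm=eros --default
radosgw-admin zonegroup create --rgw-zonegroup=<zonegroup> --endpoints=http://valvd-rgw1:<port> --master --default
radosgw-admin zone create --rgw-zonegroup=<zonegroup> --rgw-zone=eros --endpoints=http://valvd-rgw1:<port> --master --default --access-key=<system-access-key> --secret=<system-secret-key>
radosgw-admin user create --uid=<sync-user> --display-name="Synchronization User" --access-key=<system-access-key> --secret=<system-secret-key> --system
radosgw-admin period update --commit

# deployment through cephadm (Octopus "ceph orch apply rgw <realm> <zone>" syntax)
ceph orch apply rgw eros eros --placement="valvd-rgw1"

-----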

I googled and found a couple of similar issues reported to this list, in particular:

- https://www.mail-archive.com/ceph-users@xxxxxxx/msg09680.html, but that thread is about Pacific and not a cephadm-based cluster, so it may be a different problem, and its workaround does not apply since the config option does not exist in Octopus

- https://lists.ceph.io/hyperkitty/list/ceph-users@xxxxxxx/thread/UDUDIHZN5NKTUFQQV7OK2FYYNFVTL2XS/, which is about Octopus but was related to a configuration with multiple realms, whereas I have only one, defined as the default (called eros)

Both threads mention the error "Couldn't init storage provider (RADOS)", something I have not seen. I may have missed it, as I don't know exactly in which log file I should look for it.
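
Is the daemon's journald unit (or "cephadm logs") the right place to look for that error? I.e. something like the following, with the daemon name and fsid taken from the log excerpt above:

-----

# on the RGW host, the systemd/journald unit of the daemon
journalctl -u ceph-cce5ffb0-9124-40e5-a55c-3e5cc8660d47@rgw.eros.eros.valvd-rgw1.eaafgz.service

# or the cephadm wrapper around it
cephadm logs --fsid cce5ffb0-9124-40e5-a55c-3e5cc8660d47 --name rgw.eros.eros.valvd-rgw1.eaafgz

-----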

I have certainly made some trivial mistake, but I have been stuck on this problem for quite some time without any clue about where the issue lies. Thanks in advance for your help.

Cheers,

Michel
