Hello Mathias and others,
I also ran into this problem after upgrading from 16.2.9 to 17.2.1.
Additionally, I observed a health warning: "3 mgr modules have recently
crashed".
Those are actually two distinct crashes that are already in the tracker:
https://tracker.ceph.com/issues/56269 and
https://tracker.ceph.com/issues/56270
Considering that the crashes are in the snap_schedule module, I assume
they are the reason why the module is not available.
I can reproduce the crash in 56270 by failing over the mgr.
I believe that the faulty code causing the error is this line:
https://github.com/ceph/ceph/blob/v17.2.1/src/pybind/mgr/snap_schedule/fs/schedule_client.py#L193
Instead of ioctx.remove(SNAP_DB_OBJECT_NAME), it should be
ioctx.remove_object(SNAP_DB_OBJECT_NAME).
(According to my understanding of
https://docs.ceph.com/en/latest/rados/api/python/.)
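For illustration, here is a minimal sketch of how removing an object looks with the public librados Python binding; note that the pool and object names below are placeholders I made up, not the ones the snap_schedule module actually uses:

import rados

# Placeholder names for illustration only; the module uses its own
# pool and SNAP_DB_OBJECT_NAME.
POOL_NAME = "cephfs_metadata"
OBJECT_NAME = "snap_db_object"

cluster = rados.Rados(conffile="/etc/ceph/ceph.conf")
cluster.connect()
try:
    ioctx = cluster.open_ioctx(POOL_NAME)
    try:
        # Ioctx provides remove_object(); a call to a non-existent
        # ioctx.remove() would raise AttributeError instead.
        ioctx.remove_object(OBJECT_NAME)
    except rados.ObjectNotFound:
        pass  # nothing to clean up
    finally:
        ioctx.close()
finally:
    cluster.shutdown()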
Best regards,
Andreas
On 01.07.22 18:05, Kuhring, Mathias wrote:
Dear Ceph community,
After upgrading our cluster to Quincy with cephadm (ceph orch upgrade start --image quay.io/ceph/ceph:v17.2.1), I struggle to re-activate the snapshot schedule module:
0|0[root@osd-1 ~]# ceph mgr module enable snap_schedule
0|1[root@osd-1 ~]# ceph mgr module ls | grep snap
snap_schedule on
0|0[root@osd-1 ~]# ceph fs snap-schedule list / --recursive
Error ENOENT: Module 'snap_schedule' is not available
I tried restarting the MGR daemons and failing over to a restarted one, but with no change.
0|0[root@osd-1 ~]# ceph orch restart mgr
Scheduled to restart mgr.osd-1 on host 'osd-1'
Scheduled to restart mgr.osd-2 on host 'osd-2'
Scheduled to restart mgr.osd-3 on host 'osd-3'
Scheduled to restart mgr.osd-4.oylrhe on host 'osd-4'
Scheduled to restart mgr.osd-5.jcfyqe on host 'osd-5'
0|0[root@osd-1 ~]# ceph orch ps --daemon_type mgr
NAME HOST PORTS STATUS REFRESHED AGE MEM USE MEM LIM VERSION IMAGE ID CONTAINER ID
mgr.osd-1 osd-1 *:8443,9283 running (61s) 35s ago 9M 402M - 17.2.1 e5af760fa1c1 64f7ec70a6aa
mgr.osd-2 osd-2 *:8443,9283 running (47s) 36s ago 9M 103M - 17.2.1 e5af760fa1c1 d25fdc793ff8
mgr.osd-3 osd-3 *:8443,9283 running (7h) 36s ago 9M 457M - 17.2.1 e5af760fa1c1 46d5091e50d6
mgr.osd-4.oylrhe osd-4 *:8443,9283 running (7h) 79s ago 9M 795M - 17.2.1 e5af760fa1c1 efb2a7cc06c5
mgr.osd-5.jcfyqe osd-5 *:8443,9283 running (8h) 37s ago 9M 448M - 17.2.1 e5af760fa1c1 96dd03817f32
0|0[root@osd-1 ~]# ceph mgr fail
The MGR confirms that the snap_schedule module is not available:
0|0[root@osd-1 ~]# journalctl -eu ceph-55633ec3-6c0c-4a02-990c-0f87e0f7a01f@xxxxxxx-1.service
Jul 01 16:25:49 osd-1 bash[662895]: debug 2022-07-01T14:25:49.825+0000 7f0486408700 0 log_channel(audit) log [DBG] : from='client.90801080 -' entity='client.admin' cmd=[{"prefix": "fs snap-schedule list", "path": "/", "recursive": true, "target": ["mon-mgr", ""]}]: dispatch
Jul 01 16:25:49 osd-1 bash[662895]: debug 2022-07-01T14:25:49.825+0000 7f0486c09700 -1 mgr.server reply reply (2) No such file or directory Module 'snap_schedule' is not available
But I'm not sure where the MGR is actually looking. The module path is:
0|22[root@osd-1 ~]# ceph config get mgr mgr_module_path
/usr/share/ceph/mgr
And while the module is not available on the host (I assume these are just remnants from before our switch to cephadm/docker anyway):
0|0[root@osd-1 ~]# ll /usr/share/ceph/mgr
...
drwxr-xr-x. 4 root root 144 22. Sep 2021 restful
drwxr-xr-x. 3 root root 61 22. Sep 2021 selftest
drwxr-xr-x. 3 root root 61 22. Sep 2021 status
drwxr-xr-x. 3 root root 117 22. Sep 2021 telegraf
...
The module is available in the MGR container (which I assume is where the MGR would look):
0|0[root@osd-1 ~]# docker exec -it ceph-55633ec3-6c0c-4a02-990c-0f87e0f7a01f-mgr-osd-1 /bin/bash
[root@osd-1 /]# ls -l /usr/share/ceph/mgr
...
drwxr-xr-x. 4 root root 65 Jun 23 19:48 snap_schedule
...
The module was available before on Pacific, which was also deployed with cephadm.
Does anybody have an idea how I can investigate this further?
Thanks again for all your help!
Best Wishes,
Mathias
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx