Re: Ceph-mgr won't start, can't find rook module

Hi Jeff,

I ran into the very same issue and filed a bug report at https://tracker.ceph.com/issues/45574
(ceph 15.2.1 on an up-to-date Debian Buster).

TL;DR: The way ceph-mgr-rook's RookOrchestrator class interacts with
the python3-numpy package is broken.
Result: The cluster cannot start, since the 'devicehealth' plugin of
ceph-mgr is an always-on module.
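
To see which link of the import chain breaks, the modules named in the traceback quoted below can be imported one at a time. This is only a rough sketch: `check_imports` is a hypothetical helper, not part of Ceph, and the module list is simply read off the traceback. Note that a clean result under the plain system python3 would not rule out the problem, since ceph-mgr loads its modules inside its own embedded Python and may behave differently.

```python
import importlib

# Module names taken from the traceback in the quoted message (innermost first).
CHAIN = ["numpy", "websocket", "kubernetes"]


def check_imports(names):
    """Try to import each module; map name -> None on success, else an error string."""
    results = {}
    for name in names:
        try:
            importlib.import_module(name)
            results[name] = None
        except Exception as exc:  # RuntimeError as in the log, ImportError if absent
            results[name] = f"{type(exc).__name__}: {exc}"
    return results


if __name__ == "__main__":
    for name, err in check_imports(CHAIN).items():
        print(f"{name}: {'ok' if err is None else err}")
```

Run it with the same python3 that provides /lib/python3/dist-packages, so the packages checked are the ones ceph-mgr would pick up.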

Best,
Martin

On Mon, May 04, 2020 at 02:10:58AM -0700, Jeff Welling wrote:
> Hello my ceph-using comrades!
> 
> I've been using ceph for a while at home but wanted to update to the
> latest, Octopus. I got it installed on a single node, added a second
> node and some OSDs, and have been migrating from my original Jewel
> cluster. When I installed Octopus, I wiped the systems and installed
> Debian Buster, added the ceph apt repos instead of using the packages in
> Debian, and installed manually using ceph-volume to create Bluestore OSDs,
> using the ceph docs as my guide.
> 
> Now though, one of the two Octopus nodes (the one running ceph-mgr and
> ceph-mon) is crashing weekly. I haven't been able to look into the
> cause of the crashes in detail yet as these are hobbyist systems and
> work has been exceptionally busy lately, but now after the most recent
> crash, I'm unable to start ceph-mgr and the syslog has ceph-mgr messages
> complaining of not being able to find the 'rook' module. This is rather
> confusing because though I'm aware of rook, to my knowledge I've never
> used it on my systems, and there's no mention of it in my config.
> 
> I tried applying pending upgrades but that hasn't changed the behavior.
> 
> I normally wouldn't dare ask for help this early in my adventure but I
> find myself in a bit of a pinch. By any chance have you hit this before,
> or know what may be causing it?
> 
> 
> Ceph is awesome. Keep up the good work, stay safe, and Thank You Kindly
> in advance!
> 
> 
> 
> My ceph version
> 
>     root@zim:~# ceph --version
>     ceph version 15.2.1 (9fd2f65f91d9246fae2c841a6222d34d121680ee) octopus (stable)
> 
> 
> This is my ceph.conf
> 
>     [global]
>         fsid = 495d7f30-CCCC-BBBB-AAAA-ddf6ffe063d0
>         mon initial members = zim.internal.justdev.ca
>         mon host = 192.168.0.11
>         public network = 192.168.0.0/24
>         cluster network = 192.168.42.0/24
>         auth cluster required = cephx
>         auth service required = cephx
>         auth client required = cephx
>         osd journal size = 1024
>         osd pool default size = 3
>         osd pool default min size = 2
>         osd pool default pg num = 333
>         osd pool default pgp num = 333
>         osd crush chooseleaf type = 1
>         rbd_default_features = 7
> 
> 
> These are the syslog entries that show up when trying to restart ceph-mgr
> 
>     May  4 01:35:13 zim ceph-mgr[21602]: 2020-05-04T01:35:13.065-0700 7fcdccaa2f40 -1 mgr[py] Module not found: 'rook'
>     May  4 01:35:13 zim ceph-mgr[21602]: 2020-05-04T01:35:13.065-0700 7fcdccaa2f40 -1 mgr[py] Traceback (most recent call last):
>     May  4 01:35:13 zim ceph-mgr[21602]:   File "/usr/share/ceph/mgr/rook/__init__.py", line 2, in <module>
>     May  4 01:35:13 zim ceph-mgr[21602]:     from .module import RookOrchestrator
>     May  4 01:35:13 zim ceph-mgr[21602]:   File "/usr/share/ceph/mgr/rook/module.py", line 16, in <module>
>     May  4 01:35:13 zim ceph-mgr[21602]:     from kubernetes import client, config
>     May  4 01:35:13 zim ceph-mgr[21602]:   File "/lib/python3/dist-packages/kubernetes/__init__.py", line 22, in <module>
>     May  4 01:35:13 zim ceph-mgr[21602]:     import kubernetes.stream
>     May  4 01:35:13 zim ceph-mgr[21602]:   File "/lib/python3/dist-packages/kubernetes/stream/__init__.py", line 15, in <module>
>     May  4 01:35:13 zim ceph-mgr[21602]:     from .stream import stream
>     May  4 01:35:13 zim ceph-mgr[21602]:   File "/lib/python3/dist-packages/kubernetes/stream/stream.py", line 13, in <module>
>     May  4 01:35:13 zim ceph-mgr[21602]:     from . import ws_client
>     May  4 01:35:13 zim ceph-mgr[21602]:   File "/lib/python3/dist-packages/kubernetes/stream/ws_client.py", line 19, in <module>
>     May  4 01:35:13 zim ceph-mgr[21602]:     from websocket import WebSocket, ABNF, enableTrace
>     May  4 01:35:13 zim ceph-mgr[21602]:   File "/lib/python3/dist-packages/websocket/__init__.py", line 22, in <module>
>     May  4 01:35:13 zim ceph-mgr[21602]:     from ._abnf import *
>     May  4 01:35:13 zim ceph-mgr[21602]:   File "/lib/python3/dist-packages/websocket/_abnf.py", line 34, in <module>
>     May  4 01:35:13 zim ceph-mgr[21602]:     import numpy
>     May  4 01:35:13 zim ceph-mgr[21602]:   File "/lib/python3/dist-packages/numpy/__init__.py", line 142, in <module>
>     May  4 01:35:13 zim ceph-mgr[21602]:     from . import core
>     May  4 01:35:13 zim ceph-mgr[21602]:   File "/lib/python3/dist-packages/numpy/core/__init__.py", line 40, in <module>
>     May  4 01:35:13 zim ceph-mgr[21602]:     from . import multiarray
>     May  4 01:35:13 zim ceph-mgr[21602]:   File "/lib/python3/dist-packages/numpy/core/multiarray.py", line 12, in <module>
>     May  4 01:35:13 zim ceph-mgr[21602]:     from . import overrides
>     May  4 01:35:13 zim ceph-mgr[21602]:   File "/lib/python3/dist-packages/numpy/core/overrides.py", line 65, in <module>
>     May  4 01:35:13 zim ceph-mgr[21602]:     """)
>     May  4 01:35:13 zim ceph-mgr[21602]: RuntimeError: _get_implementing_args method already has a docstring
>     May  4 01:35:13 zim ceph-mgr[21602]: 2020-05-04T01:35:13.069-0700 7fcdccaa2f40 -1 mgr[py] Class not found in module 'rook'
>     May  4 01:35:13 zim ceph-mgr[21602]: 2020-05-04T01:35:13.069-0700 7fcdccaa2f40 -1 mgr[py] Error loading module 'rook': (2) No such file or directory
>     May  4 01:35:13 zim ceph-mgr[21602]: 2020-05-04T01:35:13.673-0700 7fcdccaa2f40 -1 log_channel(cluster) log [ERR] : Failed to load ceph-mgr modules: rook
> 
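
For anyone comparing their own logs against the above: the last entry is the one that names the modules ceph-mgr gave up on. A minimal sketch for pulling those names out of saved syslog text follows. It assumes each entry sits on a single line (not re-wrapped by a mail client) and that several names, if present, are separated by commas or spaces. `failed_mgr_modules` is a hypothetical helper, not a Ceph tool.

```python
import re

# Matches the summary message seen in the log above; the capture group holds
# whatever follows the colon up to the end of the line.
PATTERN = re.compile(r"Failed to load ceph-mgr modules:\s*(.+)$")


def failed_mgr_modules(log_text):
    """Return the module names listed in 'Failed to load ceph-mgr modules:' entries."""
    names = []
    for line in log_text.splitlines():
        match = PATTERN.search(line)
        if not match:
            continue
        # Assumption: names are comma- and/or whitespace-separated.
        for name in re.split(r"[,\s]+", match.group(1).strip()):
            if name and name not in names:
                names.append(name)
    return names


if __name__ == "__main__":
    sample = (
        "May  4 01:35:13 zim ceph-mgr[21602]: 2020-05-04T01:35:13.673-0700 "
        "7fcdccaa2f40 -1 log_channel(cluster) log [ERR] : "
        "Failed to load ceph-mgr modules: rook"
    )
    print(failed_mgr_modules(sample))
```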
> 
> 
> _______________________________________________
> ceph-users mailing list -- ceph-users@xxxxxxx
> To unsubscribe send an email to ceph-users-leave@xxxxxxx
