Aha, thank you Adam. That will be good.

For completeness, I just tested on Squid 19.2.0. The problem is there
too, but it manifests itself differently from Reef: it doesn't generate
crashes. When the MGR becomes active, the health status immediately
changes to WARN "Module 'dashboard' has failed dependency: No module
named 'cherrypy.wsgiserver'". The warning disappears when the affected
MGR becomes standby (and an unpatched one becomes active).

On 19/11/2024 14:15, Adam King wrote:
Given the reference to that cherrypy backports stuff in the traceback,
I'll just mention we are in the process of removing that from the code
as we've seen issues with it in our testing as well
(https://github.com/ceph/ceph/pull/60602 /
https://tracker.ceph.com/issues/68802). We want that patch in squid,
reef, and quincy so FWIW the next release of each of those branches
shouldn't have this issue any more.
On Tue, Nov 19, 2024 at 8:15 AM Chris Palmer <chris.palmer@xxxxxxxxx> wrote:
I've just applied routine CentOS 9 updates to one node of a Reef 18.2.4
system (package install). They include some python3 updates that break
the MGR in at least two ways.

When the MGR starts in standby, it immediately logs the following two
crashes. Only the dashboard and restful services are enabled (prometheus
is not enabled). We have mgr/dashboard/FEATURE_TOGGLE_DASHBOARD = false.

{
    "archived": "2024-11-19 10:30:46.031489",
    "backtrace": [
        " File \"/usr/share/ceph/mgr/dashboard/__init__.py\", line 60, in <module>\n from .module import Module, StandbyModule # noqa: F401",
        " File \"/usr/share/ceph/mgr/dashboard/module.py\", line 51, in <module>\n patch_cherrypy(cherrypy.__version__)",
        " File \"/usr/share/ceph/mgr/dashboard/cherrypy_backports.py\", line 197, in patch_cherrypy\n accept_exceptions_from_builtin_ssl(ver)",
        " File \"/usr/share/ceph/mgr/dashboard/cherrypy_backports.py\", line 113, in accept_exceptions_from_builtin_ssl\n patch_builtin_ssl_wrap(v, accept_ssl_errors)",
        " File \"/usr/share/ceph/mgr/dashboard/cherrypy_backports.py\", line 75, in patch_builtin_ssl_wrap\n from cherrypy.wsgiserver.ssl_builtin import BuiltinSSLAdapter as builtin_ssl",
        "ModuleNotFoundError: No module named 'cherrypy.wsgiserver'"
    ],
    "ceph_version": "18.2.4",
    "crash_id": "2024-11-19T09:19:40.427015Z_f96f4d70-2112-47b5-96f7-5e9bd463e8eb",
    "entity_name": "mgr.ceph1",
    "mgr_module": "dashboard",
    "mgr_module_caller": "PyModule::load_subclass_of",
    "mgr_python_exception": "ModuleNotFoundError",
    "os_id": "centos",
    "os_name": "CentOS Stream",
    "os_version": "9",
    "os_version_id": "9",
    "process_name": "ceph-mgr",
    "stack_sig": "9c98ebf1b6831bfca2823f54c9e6be01306090c2a7749def2ed8a15167fc527a",
    "timestamp": "2024-11-19T09:19:40.427015Z",
    "utsname_hostname": "ceph1.xxxxxx",
    "utsname_machine": "x86_64",
    "utsname_release": "6.1.112-1.el9.elrepo.x86_64",
    "utsname_sysname": "Linux",
    "utsname_version": "#1 SMP PREEMPT_DYNAMIC Mon Sep 30 13:59:36 EDT 2024"
}
{
    "archived": "2024-11-19 10:30:46.056007",
    "backtrace": [
        " File \"/usr/share/ceph/mgr/prometheus/__init__.py\", line 2, in <module>\n from .module import Module, StandbyModule",
        " File \"/usr/share/ceph/mgr/prometheus/module.py\", line 38, in <module>\n v = Version(cherrypy.__version__)",
        " File \"/lib/python3.9/site-packages/pkg_resources/_vendor/packaging/version.py\", line 277, in __init__\n raise InvalidVersion(\"Invalid version: '{0}'\".format(version))",
        "pkg_resources.extern.packaging.version.InvalidVersion: Invalid version: 'unknown'"
    ],
    "ceph_version": "18.2.4",
    "crash_id": "2024-11-19T09:19:42.188291Z_c112de10-cdd5-4ed3-86b5-04dffd660cb8",
    "entity_name": "mgr.ceph1",
    "mgr_module": "prometheus",
    "mgr_module_caller": "PyModule::load_subclass_of",
    "mgr_python_exception": "InvalidVersion",
    "os_id": "centos",
    "os_name": "CentOS Stream",
    "os_version": "9",
    "os_version_id": "9",
    "process_name": "ceph-mgr",
    "stack_sig": "7fb0c6c17573887e8772c311aac1f3547c8366e50d6f2f8d8bd38deb6e0e9405",
    "timestamp": "2024-11-19T09:19:42.188291Z",
    "utsname_hostname": "ceph1.xxxxxx",
    "utsname_machine": "x86_64",
    "utsname_release": "6.1.112-1.el9.elrepo.x86_64",
    "utsname_sysname": "Linux",
    "utsname_version": "#1 SMP PREEMPT_DYNAMIC Mon Sep 30 13:59:36 EDT 2024"
}

The MGR can be made active, but attempts to use the dashboard generate
more of these crashes.
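
For anyone triaging a pile of reports like the two above, here is a minimal Python sketch that reduces each one to a single line. It assumes the reports are available as parsed JSON objects (for example saved from `ceph crash info`); the helper name `summarize_crash` and the trimmed `sample` dict are illustrative only, and only field names that appear in the reports above are used.

```python
def summarize_crash(report: dict) -> str:
    """Reduce one crash report to 'module: exception (last backtrace line)'."""
    module = report.get("mgr_module", "?")
    exc = report.get("mgr_python_exception", "?")
    # The final backtrace entry carries the exception message itself.
    backtrace = report.get("backtrace") or [""]
    return f"{module}: {exc} ({backtrace[-1].strip()})"


# Hypothetical, trimmed-down stand-in for the dashboard report above.
sample = {
    "mgr_module": "dashboard",
    "mgr_python_exception": "ModuleNotFoundError",
    "backtrace": [
        ' File "/usr/share/ceph/mgr/dashboard/module.py", line 51, in <module>',
        "ModuleNotFoundError: No module named 'cherrypy.wsgiserver'",
    ],
}

print(summarize_crash(sample))
```

With real data, each report would be loaded with `json.load` before being passed to the helper.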

The culprit turns out to be python3-jaraco-text from the EPEL
repository, which upgraded from python3-jaraco-text-3.2.0-6.el9.noarch
to python3-jaraco-text-4.0.0-2.el9.noarch. (It also installed
python3-jaraco-context-6.0.1-3.el9.noarch as a new dependency.)

Reverting to python3-jaraco-text-3.2.0 has avoided the problem for now,
but that's not a sustainable long-term fix.
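
To check whether a given node is affected before or after such an update, a rough Python probe along the following lines may help. It is only a sketch: it assumes it is run with the same system python3 that ceph-mgr uses, and the distribution names passed to `importlib.metadata` ("CherryPy", "jaraco.text", "jaraco.context") are my assumption about how the packages register themselves, not something taken from the reports. The helper names `can_import` and `dist_version` are hypothetical.

```python
import importlib.util
from importlib import metadata


def can_import(name: str) -> bool:
    """True if `name` resolves to a module spec.

    Note that find_spec() imports parent packages, so probing
    'cherrypy.wsgiserver' will import 'cherrypy' itself.
    """
    try:
        return importlib.util.find_spec(name) is not None
    except (ImportError, ValueError):
        # ImportError covers a missing or broken parent package.
        return False


def dist_version(dist: str) -> str:
    """Installed version of a distribution, or 'not installed'."""
    try:
        return metadata.version(dist)
    except metadata.PackageNotFoundError:
        return "not installed"


if __name__ == "__main__":
    for mod in ("cherrypy", "cherrypy.wsgiserver"):
        print(f"{mod}: importable={can_import(mod)}")
    # Distribution names below are assumptions; adjust as needed.
    for dist in ("CherryPy", "jaraco.text", "jaraco.context"):
        print(f"{dist}: {dist_version(dist)}")
```

Comparing the output on a patched and an unpatched node should show which of the modules and versions differ.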
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx