On 17/01/2024 16:11, kefu chai wrote:
On Tue, Jan 16, 2024 at 12:11 AM Chris Palmer <chris.palmer@xxxxxxxxx>
wrote:
Updates on both problems:
Problem 1
--------------
The bookworm/reef cephadm package needs updating to accommodate the last
change in /usr/share/doc/adduser/NEWS.Debian.gz:
    System user home defaults to /nonexistent if --home is not specified.
    Packages that call adduser to create system accounts should explicitly
    specify a location for /home (see Lintian check
    maintainer-script-lacks-home-in-adduser).
i.e. when creating the cephadm user as a system user it needs to
explicitly specify the expected home directory of /home/cephadm.
Hi Chris, thank you for the bug report and the suggestion. Could you
please file a tracker ticket, so we can track and backport the related
fixes? I just created https://github.com/ceph/ceph/pull/55218 in the hope
of alleviating the problem.
I've created issue https://tracker.ceph.com/issues/64069 for this.
A workaround is to manually create the user+directory before installing
ceph.
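
As a rough way to confirm whether a host is affected, or whether the
workaround took effect, the account can be inspected before and after
installing the package. This is only a sketch: the function name
check_system_user and its return labels are made up for illustration, and
the expected home of /home/cephadm is taken from the description above.

```python
import os
import pwd

EXPECTED_HOME = "/home/cephadm"  # home directory named in the report above


def check_system_user(name="cephadm", expected_home=EXPECTED_HOME,
                      lookup=pwd.getpwnam, isdir=os.path.isdir):
    """Classify the state of a system account (hypothetical helper).

    Returns one of "missing", "wrong-home", "no-directory" or "ok".
    `lookup` and `isdir` are injectable so the check can be exercised
    without touching the real passwd database.
    """
    try:
        entry = lookup(name)
    except KeyError:
        # pwd.getpwnam raises KeyError when the account does not exist.
        return "missing"
    if entry.pw_dir != expected_home:
        # e.g. /nonexistent, the new adduser default for system users
        return "wrong-home"
    if not isdir(entry.pw_dir):
        return "no-directory"
    return "ok"


if __name__ == "__main__":
    print(check_system_user())
```

A result of "wrong-home" would match the /nonexistent default described in
NEWS.Debian.gz; "ok" is what the workaround is meant to produce.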
Problem 2
--------------
This is a complex set of interactions that prevent many mgr modules
(including dashboard) from running. It is NOT debian-specific and will
eventually bite other distributions as well. At the moment Ceph PR54710
looks the most promising fix (full or partial). Detail is spread across
the following:
https://github.com/pyca/cryptography/issues/9016
https://github.com/ceph/ceph/pull/54710
https://tracker.ceph.com/issues/63529
https://forum.proxmox.com/threads/ceph-warning-post-upgrade-to-v8.129371/page-5
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1055212
https://github.com/pyca/bcrypt/issues/694
IIUC, a backport of https://github.com/ceph/ceph/pull/54710 to reef
would address this issue, am I right?
Unfortunately I think this may be part of a much bigger MGR problem. My
understanding of the relevant background is:
* MGR modules use python subinterpreters for isolation between modules.
* Several modules (including but not limited to dashboard & restful)
use python3-cryptography for hashing and TLS (and possibly other
things).
* python3-cryptography delegates some crypto functions to Rust
functions. These include bcrypt and TLS-related functions.
* python3-cryptography uses PyO3 to invoke Rust functions.
* PyO3 does not support being used by subinterpreters. In the past
this has been allowed but was actually unsafe. Now PyO3 throws an
exception when it detects multiple initialisations.
So it appears that the MGR use of these functions has always been
unsafe, and is now forbidden.
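
The failure mode described in the points above can be imitated with a toy
model in plain Python. Nothing here uses real subinterpreters or PyO3: the
classes NativeExtension and SubInterpreter and the function
load_mgr_modules are invented for illustration, and the guard is assumed to
be a simple process-wide "already initialised" flag, which is what the
error text suggests.

```python
MESSAGE = "PyO3 modules may only be initialized once per interpreter process"


class NativeExtension:
    """Toy stand-in for a native extension with one process-wide flag."""

    def __init__(self):
        self._initialised = False

    def init_module(self):
        # Assumed guard: refuse any second initialisation in the process.
        if self._initialised:
            raise ImportError(MESSAGE)
        self._initialised = True
        return object()  # placeholder for the module object


class SubInterpreter:
    """Toy subinterpreter: its own module table, shared native extension."""

    def __init__(self, name, extension):
        self.name = name
        self.modules = {}  # each subinterpreter has a separate module cache
        self._extension = extension

    def import_rust_binding(self):
        # A cache miss in this interpreter triggers module init again,
        # even if another interpreter already initialised the extension.
        if "_rust" not in self.modules:
            self.modules["_rust"] = self._extension.init_module()
        return self.modules["_rust"]


def load_mgr_modules(names):
    """Load each named module in its own toy subinterpreter."""
    extension = NativeExtension()  # one per process
    results = {}
    for name in names:
        interp = SubInterpreter(name, extension)
        try:
            interp.import_rust_binding()
            results[name] = "ok"
        except ImportError as exc:
            results[name] = f"ImportError: {exc}"
    return results


if __name__ == "__main__":
    for module, outcome in load_mgr_modules(["dashboard", "cephadm"]).items():
        print(module, "->", outcome)
```

In this model whichever module imports the binding first succeeds and every
later one fails, which is consistent with only some mgr modules crashing.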
PR54710 identified that the code necessary for the bcrypt hashing used
during authentication could easily be written in a small amount of
native python, thus avoiding the whole PyO3 area altogether. However
there was a note in the discussions that you also had to disable TLS.
And it only applied to the dashboard. My stacktrace below shows the
exception during TLS initialisation.
As PyO3 updates are adopted in other Linux distributions this is likely
to break a number of MGR modules. As there does not seem to be any
subinterpreter support in PyO3 coming soon, the only option may be to
completely eliminate use of python3-cryptography from all MGR modules.
(It is possible MGR modules may also use other python3 modules that use
PyO3 to invoke Rust.)
Unfortunately for us, we didn't find this until we had upgraded all MONs
in a cluster to reef, at which point we can't downgrade them to quincy.
And we can't upgrade the MGR. As a temporary measure (this cluster had
MON/MGR/MDS/RGW colocated on 2 hosts) we have added another bookworm
host running a reef MON to ensure we can maintain quorum. We are not
sure whether it is safe to upgrade the other components (OSD, MDS, RGW)
while the MGR remains at quincy.
🙁
On 12/01/2024 14:29, Chris Palmer wrote:
> More info on problem 2:
>
> When starting the dashboard, the mgr seems to try to initialise
> cephadm, which in turn uses python crypto libraries that lead to the
> python error:
>
> $ ceph crash info 2024-01-12T11:10:03.938478Z_2263d2c8-8120-417e-84bc-bb01f5d81e52
> {
>     "backtrace": [
>         " File \"/usr/share/ceph/mgr/cephadm/__init__.py\", line 1, in <module>\n from .module import CephadmOrchestrator",
>         " File \"/usr/share/ceph/mgr/cephadm/module.py\", line 15, in <module>\n from cephadm.service_discovery import ServiceDiscovery",
>         " File \"/usr/share/ceph/mgr/cephadm/service_discovery.py\", line 20, in <module>\n from cephadm.ssl_cert_utils import SSLCerts",
>         " File \"/usr/share/ceph/mgr/cephadm/ssl_cert_utils.py\", line 8, in <module>\n from cryptography import x509",
>         " File \"/lib/python3/dist-packages/cryptography/x509/__init__.py\", line 6, in <module>\n from cryptography.x509 import certificate_transparency",
>         " File \"/lib/python3/dist-packages/cryptography/x509/certificate_transparency.py\", line 10, in <module>\n from cryptography.hazmat.bindings._rust import x509 as rust_x509",
>         "ImportError: PyO3 modules may only be initialized once per interpreter process"
>     ],
>     "ceph_version": "18.2.1",
>     "crash_id": "2024-01-12T11:10:03.938478Z_2263d2c8-8120-417e-84bc-bb01f5d81e52",
>     "entity_name": "mgr.xxxxx01",
>     "mgr_module": "cephadm",
>     "mgr_module_caller": "PyModule::load_subclass_of",
>     "mgr_python_exception": "ImportError",
>     "os_id": "12",
>     "os_name": "Debian GNU/Linux 12 (bookworm)",
>     "os_version": "12 (bookworm)",
>     "os_version_id": "12",
>     "process_name": "ceph-mgr",
>     "stack_sig": "7815ad73ced094695056319d1241bf7847da19b4b0dfee7a216407b59a7e3d84",
>     "timestamp": "2024-01-12T11:10:03.938478Z",
>     "utsname_hostname": "xxxxx01.xxx.xxx",
>     "utsname_machine": "x86_64",
>     "utsname_release": "6.1.0-17-amd64",
>     "utsname_sysname": "Linux",
>     "utsname_version": "#1 SMP PREEMPT_DYNAMIC Debian 6.1.69-1 (2023-12-30)"
> }
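
For anyone wanting to check whether their own mgr crashes have the same
cause, a report like the one above can be scanned for the PyO3 message.
This is a minimal sketch that only reads the JSON fields visible in the
report ("backtrace", "mgr_module", "ceph_version", "mgr_python_exception");
the function names and the trimmed SAMPLE are illustrative and not part of
any Ceph tooling.

```python
import json

# Text taken from the last backtrace entry of the report above.
PYO3_MARKER = "PyO3 modules may only be initialized once"

# Trimmed copy of the report, kept as raw text so the JSON escapes survive.
SAMPLE = r'''
{
  "backtrace": [
    " File \"/usr/share/ceph/mgr/cephadm/ssl_cert_utils.py\", line 8, in <module>\n from cryptography import x509",
    "ImportError: PyO3 modules may only be initialized once per interpreter process"
  ],
  "ceph_version": "18.2.1",
  "mgr_module": "cephadm",
  "mgr_python_exception": "ImportError"
}
'''


def is_pyo3_init_error(crash):
    """True if the crash dict looks like the PyO3 re-initialisation failure."""
    if crash.get("mgr_python_exception") != "ImportError":
        return False
    return any(PYO3_MARKER in line for line in crash.get("backtrace", []))


def summarise(crash_json):
    """Parse the JSON text of a crash report and pick out the key fields."""
    crash = json.loads(crash_json)
    return {
        "module": crash.get("mgr_module"),
        "ceph_version": crash.get("ceph_version"),
        "pyo3_init_error": is_pyo3_init_error(crash),
    }


if __name__ == "__main__":
    print(summarise(SAMPLE))
```

The JSON printed by "ceph crash info" for a given crash id could be passed
to summarise() in place of SAMPLE.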