Hi Frank,

It seems you are hitting the balancer bug in 19.2 that is common with larger PG counts (the same one mentioned in the tracker). A fix is making its way through the final(?) stages of the 19.2.1 release. Unfortunately, the only current option is to keep the balancer off and wait for 19.2.1 to arrive.

So far we have managed with manual/cron balancing using:
https://github.com/laimis9133/plankton-swarm (our own Swiss Army knife)
https://github.com/TheJJ/ceph-balancer
along with some use of
https://github.com/cernceph/ceph-scripts/blob/master/tools/upmap/upmap-remapped.py

Adding Zac directly here to bring attention to the issue once more: users attempting to upgrade to 19.2.0 should be made aware of possible balancer issues in the documentation here:
https://docs.ceph.com/en/latest/releases/squid/#v19-2-0-squid

Best,
Laimis J.

> On 6 Jan 2025, at 21:58, Frank Frampton <Frank.Frampton@xxxxxxxxxxxxxx> wrote:
>
> Recent upgrade from 18.2 to 19.2, upgrade went fine. Since the upgrade and a manager failover, I can no longer run orchestrator commands. The only error I can find on an active manager daemon is the following, or at least it is the only one that stands out.
>
> 2025-01-06T18:48:41.698+0000 7fcf42b99640 -1 mgr load Failed to construct class in 'cephadm'
> 2025-01-06T18:48:41.698+0000 7fcf42b99640 -1 mgr load Traceback (most recent call last):
>   File "/usr/share/ceph/mgr/cephadm/module.py", line 667, in __init__
>     self.cert_key_store.load()
>   File "/usr/share/ceph/mgr/cephadm/inventory.py", line 2073, in load
>     self.known_certs[entity] = json.loads(v)
>   File "/lib64/python3.9/json/__init__.py", line 346, in loads
>     return _default_decoder.decode(s)
>   File "/lib64/python3.9/json/decoder.py", line 337, in decode
>     obj, end = self.raw_decode(s, idx=_w(s, 0).end())
>   File "/lib64/python3.9/json/decoder.py", line 355, in raw_decode
>     raise JSONDecodeError("Expecting value", s, err.value) from None
> json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
>
> 2025-01-06T18:48:41.702+0000 7fcf42b99640 -1 mgr operator() Failed to run module in active mode ('cephadm')
>
> For anything to really work in the dashboard I must have the balancer off. While the balancer is off I can make changes to the orchestrator in the dashboard, and it doesn't give me any trouble. When I try different commands from a Ceph node, say "ceph cephadm config-check status", it returns "Error ENOTSUP: Module 'cephadm' is not enabled/loaded (required by command 'cephadm config-check status'): use `ceph mgr module enable cephadm` to enable it". Running "ceph mgr module enable cephadm" returns "module 'cephadm' is already enabled". I really don't know where to look or what to try to resolve this. Any "ceph orch" command results in "Error ENOENT: Module not found".
>
> I don't know that my issue is related to https://tracker.ceph.com/issues/68657, but maybe it is.
>
> I have tried the following.
> Manually added a new mgr daemon on a different node; it starts and runs the dashboard fine, but things are still not functional.
> Failed the mgr several times.
> Disabled/Enabled balancer.
> Disabled/Enabled mgr modules.
> Disabled/Enabled dashboard.
>
> All physical nodes are running Debian 12.
>
> Frank Frampton
> Senior Network Services Administrator
> Salt Lake City School District
> _______________________________________________
> ceph-users mailing list -- ceph-users@xxxxxxx
> To unsubscribe send an email to ceph-users-leave@xxxxxxx
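
On the traceback quoted above: the manager fails inside cert_key_store.load() when json.loads() is handed a stored value that does not begin with valid JSON, and "Expecting value: line 1 column 1 (char 0)" is the message Python gives for an empty string, among other inputs. The sketch below only illustrates that failure mode and one tolerant way of reading such values. It is not the cephadm code, and load_known_certs, the entity names, and the sample values are all hypothetical.

```python
import json


def load_known_certs(raw_entries):
    """Decode each stored value; collect entries that are not valid JSON.

    Hypothetical helper for illustration only, not part of cephadm.
    Returns (decoded, bad) where bad is a list of (entity, error message).
    """
    decoded, bad = {}, []
    for entity, value in raw_entries.items():
        try:
            decoded[entity] = json.loads(value)
        except json.JSONDecodeError as exc:
            # An empty string fails here at line 1 column 1 (char 0).
            bad.append((entity, str(exc)))
    return decoded, bad


if __name__ == "__main__":
    # Made-up sample data: one well-formed value and one empty value.
    sample = {"entity_a": '{"host1": "CERT"}', "entity_b": ""}
    decoded, bad = load_known_certs(sample)
    print("decoded:", decoded)
    print("undecodable:", bad)
```

Collecting the undecodable entries instead of raising on the first one shows which stored value would need attention. How to locate and repair the real values in a cluster is outside what this sketch covers.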