Hello Eugen,

I re-added my node, but I'm facing an auth issue with the OSDs. On the host I can see some of the OSDs up and running, but they are not showing in the dashboard under OSDs.

“…inutes ago - daemon:osd.101 auth get failed: failed to find osd.101 in keyring retval: -2”

# bash unit.run
--> Failed to activate via raw: did not find any matching OSD to activate
--> Running ceph config-key get dm-crypt/osd/b2781be2-010d-485b-82de-65f869563eaf/luks
Running command: /usr/bin/ceph --cluster ceph --name client.osd-lockbox.b2781be2-010d-485b-82de-65f869563eaf --keyring /var/lib/ceph/osd/ceph-101/lockbox.keyring config-key get dm-crypt/osd/b2781be2-010d-485b-82de-65f869563eaf/luks
 stderr: 2025-01-26T02:48:38.010+0000 7f3caaffd640 -1 monclient(hunting): handle_auth_bad_method server allowed_methods [2] but i only support [2,1]
 stderr: 2025-01-26T02:48:38.010+0000 7f3caa7fc640 -1 monclient(hunting): handle_auth_bad_method server allowed_methods [2] but i only support [2,1]
 stderr: 2025-01-26T02:48:38.010+0000 7f3cab7fe640 -1 monclient(hunting): handle_auth_bad_method server allowed_methods [2] but i only support [2,1]
 stderr: [errno 13] RADOS permission denied (error connecting to the cluster)
--> Failed to activate via LVM: Unable to retrieve dmcrypt secret
--> Failed to activate via simple: 'Namespace' object has no attribute 'json_config'
--> Failed to activate any OSD(s)

debug 2025-01-26T02:48:38.506+0000 7f1bccacc640 -1 monclient(hunting): handle_auth_bad_method server allowed_methods [2] but i only support [2]
debug 2025-01-26T02:48:38.506+0000 7f1bcd2cd640 -1 monclient(hunting): handle_auth_bad_method server allowed_methods [2] but i only support [2]
debug 2025-01-26T02:48:41.506+0000 7f1bcd2cd640 -1 monclient(hunting): handle_auth_bad_method server allowed_methods [2] but i only support [2]
failed to fetch mon config (--no-mon-config to skip)

Also, if the failure domain is at host level and we remove a host and its OSDs, shouldn't the cluster be able to recover?

Regards
Dev

> On Jan 25, 2025, at 1:34 PM, Eugen Block <eblock@xxxxxx> wrote:
>
> Hi,
>
>>> But now issue is, my cluster showing objects misplaced, whereas I had 5 nodes with host failure domain with R3 pool (size 3 and min 2), EC with 3+2.
>
> the math is pretty straightforward: with 5 chunks (k=3, m=2) you need (at least) 5 hosts. So you should add the host back to be able to recover. I would even suggest adding two more hosts so you can sustain the failure of one entire host.
> There are ways to recover in the current state (change the failure domain to OSD via the crush rule), but I really don't recommend that; I just mention it for the sake of completeness. I strongly suggest re-adding the fifth host (and think about adding a sixth).
>
> Regards,
> Eugen
>
> Quoting Devender Singh <devender@xxxxxxxxxx>:
>
>> +Eugen
>> Let's follow “No recovery after removing node - active+undersized+degraded -- removed osd using purge…” here.
>>
>> Sorry, I missed the ceph version, which is 18.2.4 (5 nodes with 22 OSDs each, where I removed one node and all this mess started).
>>
>> Regards
>> Dev
>>
>>
>>> On Jan 25, 2025, at 11:34 AM, Devender Singh <devender@xxxxxxxxxx> wrote:
>>>
>>> Hello Frédéric,
>>>
>>> Thanks for your reply. Yes, I also faced this issue after draining and removing the node.
>>> So I used the same commands: I removed "original_weight" from the output of ceph config-key get mgr/cephadm/osd_remove_queue and injected the file back, which resolved the orch issue.
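>>>
>>> For reference, this is roughly how I edited the JSON in that procedure (a sketch only; it assumes the stored value is a JSON list of removal entries and that jq is available on the admin node). The full steps I followed are quoted below:
>>>
>>> # dump the current queue, strip "original_weight" from every entry, write a modified copy
>>> ceph config-key get mgr/cephadm/osd_remove_queue > osd_remove_queue.json
>>> jq 'map(del(.original_weight))' osd_remove_queue.json > osd_remove_queue_modified.json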
>>>
>>> “Error ENOENT: Module not found - ceph orch commands stopped working
>>>
>>> ceph config-key get mgr/cephadm/osd_remove_queue > osd_remove_queue.json
>>>
>>> Then only remove the "original_weight" key from that json and upload it back to the config-key store:
>>>
>>> ceph config-key set mgr/cephadm/osd_remove_queue -i osd_remove_queue_modified.json
>>>
>>> Then fail the mgr:
>>>
>>> ceph mgr fail”
>>>
>>>
>>> But now the issue is that my cluster is showing misplaced objects, whereas I had 5 nodes with host failure domain, an R3 pool (size 3, min_size 2), and EC 3+2.
>>>
>>> # ceph -s
>>>   cluster:
>>>     id:     384d7590-d018-11ee-b74c-5b2acfe0b35c
>>>     health: HEALTH_WARN
>>>             Degraded data redundancy: 2848547/29106793 objects degraded (9.787%), 105 pgs degraded, 132 pgs undersized
>>>
>>>   services:
>>>     mon: 4 daemons, quorum node1,node5,node4,node2 (age 12h)
>>>     mgr: node1.cvknae(active, since 12h), standbys: node4.foomun
>>>     mds: 2/2 daemons up, 2 standby
>>>     osd: 95 osds: 95 up (since 16h), 95 in (since 21h); 124 remapped pgs
>>>     rgw: 2 daemons active (2 hosts, 1 zones)
>>>
>>>   data:
>>>     volumes: 2/2 healthy
>>>     pools:   18 pools, 817 pgs
>>>     objects: 6.06M objects, 20 TiB
>>>     usage:   30 TiB used, 302 TiB / 332 TiB avail
>>>     pgs:     2848547/29106793 objects degraded (9.787%)
>>>              2617833/29106793 objects misplaced (8.994%)
>>>              561 active+clean
>>>              124 active+clean+remapped
>>>              105 active+undersized+degraded
>>>              27  active+undersized
>>>
>>>   io:
>>>     client: 1.4 MiB/s rd, 4.0 MiB/s wr, 25 op/s rd, 545 op/s wr
>>>
>>> And when using 'ceph config-key ls' it is still showing the old node and its OSDs:
>>>
>>> # ceph config-key ls | grep -i 03n
>>>     "config-history/135/+osd/host:node3/osd_memory_target",
>>>     "config-history/14990/+osd/host:node3/osd_memory_target",
>>>     "config-history/14990/-osd/host:node3/osd_memory_target",
>>>     "config-history/15003/+osd/host:node3/osd_memory_target",
>>>     "config-history/15003/-osd/host:node3/osd_memory_target",
>>>     "config-history/15016/+osd/host:node3/osd_memory_target",
>>>     "config-history/15016/-osd/host:node3/osd_memory_target",
>>>     "config-history/15017/+osd/host:node3/osd_memory_target",
>>>     "config-history/15017/-osd/host:node3/osd_memory_target",
>>>     "config-history/15022/+osd/host:node3/osd_memory_target",
>>>     "config-history/15022/-osd/host:node3/osd_memory_target",
>>>     "config-history/15024/+osd/host:node3/osd_memory_target",
>>>     "config-history/15024/-osd/host:node3/osd_memory_target",
>>>     "config-history/15025/+osd/host:node3/osd_memory_target",
>>>     "config-history/15025/-osd/host:node3/osd_memory_target",
>>>     "config-history/153/+osd/host:node3/osd_memory_target",
>>>     "config-history/153/-osd/host:node3/osd_memory_target",
>>>     "config-history/165/+mon.node3/container_image",
>>>     "config-history/171/-mon.node3/container_image",
>>>     "config-history/176/+client.crash.node3/container_image",
>>>     "config-history/182/-client.crash.node3/container_image",
>>>     "config-history/4276/+osd/host:node3/osd_memory_target",
>>>     "config-history/4276/-osd/host:node3/osd_memory_target",
>>>     "config-history/433/+client.ceph-exporter.node3/container_image",
>>>     "config-history/439/-client.ceph-exporter.node3/container_image",
>>>     "config-history/459/+osd/host:node3/osd_memory_target",
>>>     "config-history/459/-osd/host:node3/osd_memory_target",
>>>     "config-history/465/+osd/host:node3/osd_memory_target",
>>>     "config-history/465/-osd/host:node3/osd_memory_target",
>>>     "config-history/4867/+osd/host:node3/osd_memory_target",
>>>     "config-history/4867/-osd/host:node3/osd_memory_target",
"config-history/4878/+mon.node3/container_image", >>> "config-history/4884/-mon.node3/container_image", >>> "config-history/4889/+client.crash.node3/container_image", >>> "config-history/4895/-client.crash.node3/container_image", >>> "config-history/5139/+mds.k8s-dev-cephfs.node3.iebxqn/container_image", >>> "config-history/5142/-mds.k8s-dev-cephfs.node3.iebxqn/container_image", >>> "config-history/5150/+client.ceph-exporter.node3/container_image", >>> "config-history/5156/-client.ceph-exporter.node3/container_image", >>> "config-history/5179/+osd/host:node3/osd_memory_target", >>> "config-history/5179/-osd/host:node3/osd_memory_target", >>> "config-history/5183/+client.rgw.sea-dev.node3.betyqd/rgw_frontends", >>> "config-history/5189/+osd/host:node3/osd_memory_target", >>> "config-history/5189/-osd/host:node3/osd_memory_target", >>> "config-history/6929/-client.rgw.sea-dev.node3.betyqd/rgw_frontends", >>> "config-history/6933/+osd/host:node3/osd_memory_target", >>> "config-history/6933/-osd/host:node3/osd_memory_target", >>> "config-history/9710/+osd/host:node3/osd_memory_target", >>> "config-history/9710/-osd/host:node3/osd_memory_target", >>> "config/osd/host:node3/osd_memory_target”, >>> >>> >>> Regards >>> Dev >>> >>>> On Jan 25, 2025, at 4:39 AM, Frédéric Nass <frederic.nass@xxxxxxxxxxxxxxxx> wrote: >>>> >>>> Hi, >>>> >>>> I've seen this happening on a test cluster after draining a host that also had a MGR service. Can you check if Eugen's solution here [1] helps in your case ? And maybe investigate 'ceph config-key ls' for any issues in config keys ? >>>> >>>> Regards, >>>> Frédéric. >>>> >>>> [1] https://www.google.com/url?q=https://www.spinics.net/lists/ceph-users/msg83667.html&source=gmail-imap&ust=1738445690000000&usg=AOvVaw15NWLxIRc3boBpYf4URpvo <https://www.google.com/url?q=https://www.google.com/url?q%3Dhttps://www.spinics.net/lists/ceph-users/msg83667.html%26source%3Dgmail-imap%26ust%3D1738413580000000%26usg%3DAOvVaw3Zk70LrQ6SrLX02gJ7Cowl&source=gmail-imap&ust=1738445690000000&usg=AOvVaw059BufjCpNI3NhOIPfBdFy> >>>> >>>> De : Devender Singh <devender@xxxxxxxxxx <mailto:devender@xxxxxxxxxx>> >>>> Envoyé : samedi 25 janvier 2025 06:27 >>>> À : Fnu Virender Kumar >>>> Cc: ceph-users >>>> Objet : Re: Error ENOENT: Module not found >>>> >>>> Thanks for you reply… but those command not working as its an always module..but strange still showing error, >>>> >>>> # ceph mgr module enable orchestrator >>>> module 'orchestrator' is already enabled (always-on) >>>> >>>> # ceph orch set backend — returns successfully… >>>> >>>> # # ceph orch ls >>>> Error ENOENT: No orchestrator configured (try `ceph orch set backend`) >>>> >>>> Its revolving between same error.. >>>> >>>> Root Cause: I removed a hosts and its odd’s and after some time above error started automatically. >>>> >>>> Earlier in the had 5 nodes but now 4.. Cluster is showing unclean pg but not doing anything.. 
>>>>
>>>> But the big error is still Error ENOENT.
>>>>
>>>>
>>>> Regards
>>>> Dev
>>>>
>>>> > On Jan 24, 2025, at 4:59 PM, Fnu Virender Kumar <virenderk@xxxxxxxxxxxx> wrote:
>>>> >
>>>> > Did you try
>>>> >
>>>> > ceph mgr module enable orchestrator
>>>> > ceph orch set backend
>>>> > ceph orch ls
>>>> >
>>>> > Check the mgr service daemon as well
>>>> > ceph -s
>>>> >
>>>> >
>>>> > Regards
>>>> > Virender
>>>> > From: Devender Singh <devender@xxxxxxxxxx>
>>>> > Sent: Friday, January 24, 2025 6:34:43 PM
>>>> > To: ceph-users <ceph-users@xxxxxxx>
>>>> > Subject: Error ENOENT: Module not found
>>>> >
>>>> >
>>>> > Hello all
>>>> >
>>>> > Any quick fix for …
>>>> >
>>>> > root@sea-devnode1:~# ceph orch ls
>>>> > Error ENOENT: Module not found
>>>> >
>>>> >
>>>> > Regards
>>>> > Dev
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx