Hi Adam,

Thanks a lot for your answer. I have tried "ceph mgr fail" and the active
manager migrated to a different node, but the "ceph orch" commands continue
to hang.

# ceph orch status --verbose
...
Submitting command: {'prefix': 'orch status', 'target': ('mon-mgr', '')}
submit {"prefix": "orch status", "target": ["mon-mgr", ""]} to mon-mgr

I don't know the messaging system it uses to communicate with the target
mon-mgr, but it seems the message never gets a response, so it makes sense
to think that something is blocked in the mgr. I just do not know how to
check the mgr internals.
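If I understand it correctly, the CLI serializes the command to JSON (the
payload shown by --verbose above) and hands it to librados, which forwards
it to the active mgr, where the cephadm/orchestrator module serves the
"orch" prefixes. Assuming that is right, I think the hang could be
reproduced outside the CLI with a minimal, untested python-rados sketch
like the one below (the conffile/keyring paths and the 'format' field are
just my assumptions):

import json
import rados

# Connect with the local admin credentials (paths are assumptions).
cluster = rados.Rados(conffile='/etc/ceph/ceph.conf',
                      conf=dict(keyring='/etc/ceph/ceph.client.admin.keyring'))
cluster.connect()

# Same JSON payload the CLI builds; it should be delivered to the active
# mgr, where the cephadm module handles the "orch status" prefix.
cmd = json.dumps({'prefix': 'orch status', 'format': 'json'})
ret, outbuf, outs = cluster.mgr_command(cmd, b'')
print(ret, outs)
print(outbuf.decode('utf-8', errors='replace'))

cluster.shutdown()

If this call also blocks forever, that would point at the cephadm module
inside the mgr rather than at the CLI itself. I will also try raising the
cephadm log level (if I read the docs right: "ceph config set mgr
mgr/cephadm/log_to_cluster_level debug" and then "ceph -W cephadm
--watch-debug") to see whether the module logs anything when the command
arrives.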
On Thu, 16 Sept 2021 at 14:53, Adam King <adking@xxxxxxxxxx> wrote:

> Does running "ceph mgr fail" then waiting a bit make the "ceph orch"
> commands responsive? That's worked for me sometimes before when they
> wouldn't respond.
>
> On Thu, Sep 16, 2021 at 8:08 AM Javier Cacheiro <Javier.Cacheiro@xxxxxxxxx>
> wrote:
>
>> Hi,
>>
>> I have configured a ceph cluster with the new Pacific version (16.2.4)
>> using cephadm to see how it performed.
>>
>> Everything went smoothly and the cluster was working fine until I did an
>> ordered shutdown and reboot of the nodes. After that, all "ceph orch"
>> commands hang as if they were not able to contact the cephadm
>> orchestrator.
>>
>> I have seen other people experiencing a similar issue in a past thread
>> after a power outage; it was resolved by restarting the services on each
>> host. I have tried that, but it did not work.
>>
>> As in that case, the logs also give no clue and show no errors. All the
>> containers are running fine except for the rgw ones, but I suspect those
>> are irrelevant for this case (of course I could be wrong).
>>
>> There is no data in the cluster yet, apart from tests, but I would really
>> like to find the cause of this issue. I am having a hard time figuring
>> out how "ceph orch" contacts the cephadm module of the ceph-mgr in order
>> to explore the issue in more detail. Any ideas on how to proceed are much
>> appreciated.
>>
>> It would also be of great help if you could point me to where to look in
>> the code, or to any details about how the command-line tool contacts the
>> cephadm API in the ceph-mgr.
>>
>> Thanks a lot,
>> Javier
>>
>> Here are the details:
>>
>> # ceph orch status --verbose
>> ...
>> Submitting command: {'prefix': 'orch status', 'target': ('mon-mgr', '')}
>> submit {"prefix": "orch status", "target": ["mon-mgr", ""]} to mon-mgr
>> --> at this point it hangs forever (strace shows it is blocked on a futex lock)
>>
>> # ceph status
>>
>>   cluster:
>>     id:     c6e89d30-de52-11eb-a76f-bc97e1e57d70
>>     health: HEALTH_WARN
>>             8 failed cephadm daemon(s)
>>             1 pools have many more objects per pg than average
>>             pauserd,pausewr,nodown,noout,nobackfill,norebalance,norecover flag(s) set
>>
>>   services:
>>     mon: 2 daemons, quorum c26-1,c28-1 (age 105m)
>>     mgr: c26-1.sojetc(active, since 105m), standbys: c28-1.zmwxro,
>>          c27-1.ixiiun, c28-38.lpsgmq, c26-40.ltomjc
>>     osd: 192 osds: 192 up (since 110m), 192 in (since 10w)
>>          flags pauserd,pausewr,nodown,noout,nobackfill,norebalance,norecover
>>
>>   data:
>>     pools:   9 pools, 7785 pgs
>>     objects: 24.69k objects, 166 GiB
>>     usage:   24 TiB used, 2.7 PiB / 2.8 PiB avail
>>     pgs:     7785 active+clean
>>
>> NOTE: The cluster actually had 5 mons running, but in the last test I
>> started only two of them, and I saw the others first appear as not
>> available and then get automatically removed from the config. So even
>> after later starting the other nodes, they are no longer being used as
>> mons. Interestingly enough, they are still used as mgrs.
>>
>> # ceph health detail
>> HEALTH_WARN 8 failed cephadm daemon(s); 1 pools have many more objects
>> per pg than average;
>> pauserd,pausewr,nodown,noout,nobackfill,norebalance,norecover flag(s) set
>> [WRN] CEPHADM_FAILED_DAEMON: 8 failed cephadm daemon(s)
>>     daemon rgw.cesga.c27-35.qfbwai on c27-35 is in error state
>>     daemon rgw.cesga.c27-35.eelnnx on c27-35 is in error state
>>     daemon rgw.cesga.c27-35.mihttm on c27-35 is in error state
>>     daemon rgw.cesga.c27-35.redbiq on c27-35 is in error state
>>     daemon rgw.cesga.c27-36.igdmae on c27-36 is in error state
>>     daemon rgw.cesga.c27-36.xrjhxh on c27-36 is in error state
>>     daemon rgw.cesga.c27-36.rubmyu on c27-36 is in error state
>>     daemon rgw.cesga.c27-36.swrygg on c27-36 is in error state
>> [WRN] MANY_OBJECTS_PER_PG: 1 pools have many more objects per pg than average
>>     pool glance-images objects per pg (36) is more than 12 times cluster average (3)
>> [WRN] OSDMAP_FLAGS: pauserd,pausewr,nodown,noout,nobackfill,norebalance,norecover flag(s) set
>>
>> # ceph mgr module ls
>> "always_on_modules": [
>>     "balancer",
>>     "crash",
>>     "devicehealth",
>>     "orchestrator",
>>     "pg_autoscaler",
>>     "progress",
>>     "rbd_support",
>>     "status",
>>     "telemetry",
>>     "volumes"
>> ],
>> "enabled_modules": [
>>     "cephadm",
>>     "dashboard",
>>     "iostat",
>>     "prometheus",
>>     "restful"
>> ],
>>
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx