Hi,
> But we are not sure if we should enable some of them. Right now none
> of the logs we have from Ceph show errors. Would enabling some of
> those modules help us see more logs?
I would not enable more modules; that could make things worse. Instead
you could try disabling the diskprediction_local module. But first I
would stop those hanging containers (docker/podman stop <ID>) on all
affected hosts, then maybe restart the mgr daemons one by one and see
if that helps.
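
Something along these lines (the container IDs are the ones from your
listing below; the mgr daemon name is a placeholder, "ceph orch ps"
shows the actual names in your cluster):

  # on each affected host, stop the hung command containers
  docker stop af13bda77a1a 5b5c760454c7

  # disable the suspected module
  ceph mgr module disable diskprediction_local

  # then restart the mgr daemons one at a time, e.g. via cephadm
  ceph orch daemon restart mgr.<host>.<id>

If the orchestrator itself is unresponsive, restarting the mgr's
systemd unit directly on the host should also work.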
Zitat von ManuParra <mparra@xxxxxx>:
> Hi Eugen, this is the output of "ceph mgr module ls":
>
> {
>     "always_on_modules": [
>         "balancer",
>         "crash",
>         "devicehealth",
>         "orchestrator",
>         "pg_autoscaler",
>         "progress",
>         "rbd_support",
>         "status",
>         "telemetry",
>         "volumes"
>     ],
>     "enabled_modules": [
>         "cephadm",
>         "dashboard",
>         "diskprediction_local",
>         "iostat",
>         "prometheus",
>         "restful"
>     ],
>     "disabled_modules": [
>     ...
> }
>
> As you can see, balancer/crash/… are in the always_on section. I
> checked it on all 3 monitor nodes with the same output.
>
> Then, looking at disabled_modules, I saw several modules that might
> help us gather more information (logs) about our problem, such as:
> - alerts
> - insights
> - test_orchestrator
> - and others…
>
> But we are not sure if we should enable some of them. Right now none
> of the logs we have from Ceph show errors. Would enabling some of
> those modules help us see more logs?
>
> On the other hand, regarding the commands that hang: the containers
> that launch them keep running, since they are waiting for the command
> to finish. Here is the list of commands that are still running (hung):
>
>
> af13bda77a1a  172.16.3.146:4000/ceph/ceph:v15.2.9  "ceph osd status"       23 hours ago  Up 23 hours  wizardly_leavitt
> 5b5c760454c7  172.16.3.146:4000/ceph/ceph:v15.2.9  "ceph telemetry stat…"  24 hours ago  Up 24 hours  intelligent_bardeen
> a98e6061489d  172.16.3.146:4000/ceph/ceph:v15.2.9  "ceph service dump"     24 hours ago  Up 24 hours  romantic_mendel
> 66c943a032f8  172.16.3.146:4000/ceph/ceph:v15.2.9  "ceph service status"   24 hours ago  Up 24 hours  happy_shannon
> 7e18899dffc5  172.16.3.146:4000/ceph/ceph:v15.2.9  "ceph crash stat"       24 hours ago  Up 24 hours  xenodochial_germain
> 8268082e753b  172.16.3.146:4000/ceph/ceph:v15.2.9  "ceph crash ls"         24 hours ago  Up 24 hours  stoic_volhard
> fc5c434a4e23  172.16.3.146:4000/ceph/ceph:v15.2.9  "ceph balancer status"  24 hours ago  Up 24 hours  epic_mendel
>
> So the containers will have to be removed.
>
> As for the logs of these containers, nothing appears inside the
> container (docker logs xxxx); only when you kill it can you see the
> --verbose output:
> [ceph: root@spsrc-mon-1 /]# ceph --verbose pg stat
> ….
> validate_command: pg stat
> better match: 2.5 > 0: pg stat
> bestcmds_sorted:
> [{'flags': 8,
>   'help': 'show placement group status.',
>   'module': 'pg',
>   'perm': 'r',
>   'sig': [argdesc(<class 'ceph_argparse.CephPrefix'>, req=True,
>                   name=prefix, n=1, numseen=0, prefix=pg),
>           argdesc(<class 'ceph_argparse.CephPrefix'>, req=True,
>                   name=prefix, n=1, numseen=0, prefix=stat)]}]
> Submitting command: {'prefix': 'pg stat', 'target': ('mon-mgr', '')}
> submit ['{"prefix": "pg stat", "target": ["mon-mgr", ""]}'] to mon-mgr
> [hung forever …]
>
> Kind regards,
> Manu.
>
>
>
> _______________________________________________
> ceph-users mailing list -- ceph-users@xxxxxxx
> To unsubscribe send an email to ceph-users-leave@xxxxxxx