Re: 14.2.22 dashboard periodically dies and didn't failover

On 13.01.22 at 08:37, Szabo, Istvan (Agoda) wrote:
> Hi,
>
> I can see a lot of messages regarding the rotating keys, but I'm not sure this is the root cause.
>
> 2022-01-13 03:21:57.156 7fe7e085e700 -1 monclient: _check_auth_rotating possible clock skew, rotating keys expired way too early (before 2022-01-13 02:21:57.156836)
> 2022-01-13 03:22:01.484 7fe7e2862700 -1 received  signal: Hangup from killall -q -1 ceph-mon ceph-mgr ceph-mds ceph-osd ceph-fuse radosgw rbd-mirror  (PID: 1572574) UID: 0
>
> I have 3 mons with 3 mgrs, and the dashboard is installed on all mgrs.
>
> When the mgr dies on the first node, it doesn't fail over to the other 2; only a service restart solves the issue.
>
> Any idea?


We have seen a similar issue starting with 14.2.22, though our situation is slightly different: the mgr gets stuck and the cluster elects another mgr as primary, but the original primary does not recover. The process stays stuck. I have a (large) backtrace if someone is interested.

For us it seems that the prometheus exporter module is the cause. Do you have it enabled?
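
In case it helps with checking, here is a minimal Python sketch that reports whether a given mgr module is enabled. It assumes that `ceph mgr module ls --format json` prints a JSON object with a top-level `enabled_modules` list of module names; please verify that against the output on your release. The helper names `module_enabled` and `fetch_module_ls` are made up for this example.

```python
import json
import subprocess


def module_enabled(module_ls_json: str, name: str = "prometheus") -> bool:
    """Return True if `name` is listed under "enabled_modules".

    Assumption: the JSON has a top-level "enabled_modules" list of
    module names. A missing key is treated as "not enabled".
    """
    data = json.loads(module_ls_json)
    return name in data.get("enabled_modules", [])


def fetch_module_ls() -> str:
    """Run the ceph CLI and return its JSON output as a string.

    Needs a working `ceph` command and a usable keyring on this host.
    """
    result = subprocess.run(
        ["ceph", "mgr", "module", "ls", "--format", "json"],
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout
```

On a node with admin access, `module_enabled(fetch_module_ls())` should tell you whether the prometheus module is on; pass `"dashboard"` as the second argument to check the dashboard instead.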


Peter



