Re: After upgrading from 17.2.6 to 18.2.0, OSDs are very frequently restarting due to livenessprobe failures

If nothing obvious shows up in the OSD logs, such as a failure to start, and
the OSDs appear to run normally until the liveness probe restarts them, you
could disable the liveness probe or change its timeouts. See
https://rook.io/docs/rook/latest/CRDs/Cluster/ceph-cluster-crd/#health-settings
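
As a rough sketch of what that change could look like, the snippet below
builds a merge patch for the healthCheck.livenessProbe.osd section of the
CephCluster CR. The helper name osd_liveness_patch and the probe values are
hypothetical, picked only for illustration; check the field names against
the health settings page above for your Rook version before applying.

```python
import json


def osd_liveness_patch(disabled=False, timeout_seconds=10,
                       period_seconds=20, failure_threshold=6):
    """Build a merge patch for spec.healthCheck.livenessProbe.osd.

    The default values are illustrative assumptions, not recommendations.
    """
    osd = {"disabled": disabled}
    if not disabled:
        # Standard Kubernetes probe fields; only the ones set here are overridden.
        osd["probe"] = {
            "timeoutSeconds": timeout_seconds,
            "periodSeconds": period_seconds,
            "failureThreshold": failure_threshold,
        }
    return {"spec": {"healthCheck": {"livenessProbe": {"osd": osd}}}}


if __name__ == "__main__":
    # Print the patch as JSON so it can be passed to kubectl.
    print(json.dumps(osd_liveness_patch()))
```

Assuming the script is saved as patch.py and both the cluster and its
namespace are named rook-ceph (adjust to your deployment), the output could
be applied with something like:
kubectl -n rook-ceph patch cephcluster rook-ceph --type merge -p "$(python3 patch.py)"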

Of course, we still need to understand whether there is an underlying issue
with the OSDs. Please open a Rook issue if it appears to be related to the
liveness probe.

Travis

On Thu, Sep 21, 2023 at 3:12 AM Igor Fedotov <igor.fedotov@xxxxxxxx> wrote:

> Hi!
>
> Can you share OSD logs demonstrating such a restart?
>
>
> Thanks,
>
> Igor
>
> On 20/09/2023 20:16, sbengeri@xxxxxxxxx wrote:
> > Since upgrading to 18.2.0, OSDs are very frequently restarting due to
> > livenessprobe failures, making the cluster unusable. Has anyone else seen
> > this behavior?
> >
> > Upgrade path: ceph 17.2.6 to 18.2.0 (and rook from 1.11.9 to 1.12.1)
> > on ubuntu 20.04 kernel 5.15.0-79-generic
> >
> > Thanks.
> > _______________________________________________
> > ceph-users mailing list -- ceph-users@xxxxxxx
> > To unsubscribe send an email to ceph-users-leave@xxxxxxx