Osds going down/flapping after Luminous to Nautilus upgrade part 2

This second post is about the next type of flapping OSDs we encountered after upgrading. We started to see OSDs being marked down, with messages like this in 'ceph -w':

2024-08-01 12:02:57.437135 mon.cat-hlz-stor001 [INF] osd.479 marked down after no beacon for 902.637005 seconds
2024-08-01 12:02:57.468372 mon.cat-hlz-stor001 [WRN] Health check failed: 1 osds down (OSD_DOWN)
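
For anyone wanting to check their own logs for the same pattern, here is a minimal Python sketch that pulls the OSD id and the silent period out of such a line and expresses it in beacon intervals. The function name and the regular expression are made up for illustration and are based only on the message format shown above; the 300 second default is our configured beacon interval, so adjust it to your own setting.

```python
import re

# Matches the monitor's "no beacon" message as it appears in the log above.
# This pattern is an assumption derived from that one sample line.
NO_BEACON = re.compile(
    r"osd\.(?P<osd>\d+) marked down after no beacon for "
    r"(?P<secs>\d+(?:\.\d+)?) seconds"
)


def missed_beacons(line, beacon_interval=300.0):
    """Return (osd_id, silent_seconds, whole_intervals_missed), or None.

    beacon_interval is the OSD beacon interval in seconds (300 in our
    cluster; hypothetical default for this sketch).
    """
    m = NO_BEACON.search(line)
    if m is None:
        return None
    secs = float(m.group("secs"))
    return int(m.group("osd")), secs, int(secs // beacon_interval)


line = (
    "2024-08-01 12:02:57.437135 mon.cat-hlz-stor001 [INF] "
    "osd.479 marked down after no beacon for 902.637005 seconds"
)
print(missed_beacons(line))  # (479, 902.637005, 3)
```

For the line above this reports three full intervals without a beacon. As far as I know that is consistent with the monitor's mon_osd_report_timeout, which defaults to 900 seconds, so the mon seems to be acting on a real (or perceived) absence of beacons rather than on a single late one.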

We have the OSD beacon interval set to 300 seconds. To fix this we tried:

- restarting osds
- restarting mons
- tidying up ntp
- restarting mgrs

However, it is still happening. Poking around in the osd and mon logs, we did see some lines hinting that the mon might be listening for beacons using msgr v1, which could be broken (see part 1). That is why we restarted them again, but this did not have any effect.

Apart from enabling the v2 msgr, we have not altered our Luminous config for Nautilus. Are we missing something?

Regards

Mark