Re: Network Flapping Causing Slow Ops and Freezing VMs

Hi,

just to get a better understanding: when you write

> Although the OSDs were correctly marked as down in the monitor, slow ops persisted until we resolved the network issue.

do you mean that the MONs marked the OSDs as down (temporarily), or did you mark them down yourself? If the OSDs were flapping, they would also keep marking themselves "up" again, and that should be reflected in the OSD logs as messages like "wrongly marked me down". Can you confirm that the daemons were still running and logged the "wrongly marked me down" messages? In some cases the "nodown" flag can prevent OSDs from flapping, but since you had an actual network issue it wouldn't really have helped here. I would probably have set the noout flag and stopped the OSD daemons on the affected node until the issue was resolved.
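
To check the logs for this, something along the following lines might help. It is only a minimal sketch: it counts lines containing the phrase "wrongly marked me down" in whatever log files you pass to it. The function names are made up for illustration, and the log location (often /var/log/ceph/ on package-based installs, but different for containerized deployments) is an assumption you would need to adapt.

```python
import sys
from typing import Dict, Iterable, List

# Phrase an OSD logs when it finds it was marked down while still running.
# Only the phrase itself is relied on; the rest of the log line layout is
# not assumed.
MARKER = "wrongly marked me down"


def count_wrongly_marked_down(lines: Iterable[str]) -> int:
    """Return the number of lines that contain the marker phrase."""
    return sum(1 for line in lines if MARKER in line)


def scan_osd_logs(paths: List[str]) -> Dict[str, int]:
    """Count marker lines per log file (hypothetical helper)."""
    counts = {}
    for path in paths:
        # errors="replace" keeps the scan going if a log has odd bytes.
        with open(path, errors="replace") as fh:
            counts[path] = count_wrongly_marked_down(fh)
    return counts


if __name__ == "__main__":
    # Log paths are given on the command line.
    for path, n in scan_osd_logs(sys.argv[1:]).items():
        print(f"{path}: {n}")
```

Run it with the OSD log files of the affected node as arguments. A non-zero count for an OSD would suggest that its daemon was still up and kept reporting itself as alive while the MONs had marked it down.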

Regards,
Eugen

Quoting mahnoosh shahidi <mahnooosh.shd@xxxxxxxxx>:

Hi all,

I hope this message finds you well. We recently encountered an issue on one
of our OSD servers, leading to network flapping and subsequently causing
significant performance degradation across our entire cluster. Although the
OSDs were correctly marked as down in the monitor, slow ops persisted until
we resolved the network issue. This incident resulted in a major
disruption, especially affecting VMs with mapped RBD images, leading to
their freezing.

In light of this, I have two key questions for the community:

1. Why did slow ops persist even after marking the affected server as down
in the monitor?

2. Are there any recommended configurations for OSD suicide or OSD down
reports that could help us better handle similar network-related issues in
the future?

Best Regards,
Mahnoosh
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx

