On 10.06.21 at 17:45, Manuel Lausch wrote:
> Hi Peter,
>
> your suggestion pointed me to the right spot.
> I didn't know about the feature that lets ceph read from replica
> PGs.
>
> Moving on: I found two functions in osd/PrimaryLogPG.cc,
> "check_laggy" and "check_laggy_requeue". Both start with a check of
> whether the partners have the octopus features. If not, the function
> is skipped. This explains why the problem began after about half of
> the cluster was updated.
>
> To verify this, I added "return true" as the first line of both
> functions. With that, the issue is gone. But
> I don't know what problems this could trigger. I know the root cause
> is not fixed by it.
> I think I will open a bug ticket with this knowledge.

I wonder if I faced the same issue. The issue I had occurred when OSDs
came back up and peering started.

My cluster was a fresh octopus install, so I think the min osd release
was set to octopus.

Is it generally safe to leave this switch at nautilus while running
octopus, in order to stay on a maintained release?

>
>
> osd_op_queue_cutoff is set to high
> and an ICMP rate limiting should not happen

It could, if you choose fast shutdown and connections to the OSD daemon
are refused with ICMP port unreachable?!

Peter