While poking through one of our Nautilus clusters I noticed that OSDs have HB peers that do not share any PGs with them. Nautilus added OSDMap::get_random_up_osds_by_subtree() to select random up OSDs spread across crush buckets of type mon_osd_reporter_subtree_level, even when mon_osd_min_down_reporters is already met. If you have multiple types of hardware mapped to different pools, OSDs from these pools will HB each other, which is not necessarily expected from an operations point of view. It also has the potential to wrongly mark OSDs down if one type of hardware is having issues.

The more HB peers the better, but couldn't we increase the default for mon_osd_min_down_reporters instead, and only call get_random_up_osds_by_subtree() when it isn't already met by the PG-sharing peers? (A rough sketch of the ordering I have in mind is at the bottom of this mail.) I initially made a patch to exclude any OSD that is not part of the same crush root, but that wouldn't work in general since a crush rule can span multiple trees. I'm not sure what other alternatives there are.

Another bit, from pre-Nautilus: the OSDs with id-1 and id+1 are added to the HB peers in order to have a "fully-connected set" [1]. I'm not sure I understand that comment; could somebody briefly explain how this creates a fully-connected set, and what set we're talking about?

Thanks!

[1] https://github.com/ceph/ceph/blob/master/src/osd/OSD.cc#L5141
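
Sketch mentioned above. This is a standalone toy, not the actual OSD.cc code: osd_info, shares_pgs and choose_heartbeat_peers() are just placeholders standing in for the real OSDMap/CRUSH lookups, only the ordering is the point.

// Start from the peers we genuinely share PGs with, and only fall back to
// random subtree-spread OSDs (what get_random_up_osds_by_subtree() does
// unconditionally today) when those peers don't already cover
// mon_osd_min_down_reporters distinct reporter subtrees.
#include <algorithm>
#include <random>
#include <set>
#include <vector>

struct osd_info {
  int id;
  int subtree;      // crush bucket at mon_osd_reporter_subtree_level (e.g. host)
  bool shares_pgs;  // does this osd share any PG with us?
};

std::set<int> choose_heartbeat_peers(int whoami,
                                     unsigned min_down_reporters,
                                     std::vector<osd_info> up_osds,
                                     std::mt19937& rng)
{
  std::set<int> peers;
  std::set<int> subtrees;

  // 1) peers we actually share PGs with (the pre-Nautilus behaviour)
  for (const auto& o : up_osds) {
    if (o.id != whoami && o.shares_pgs) {
      peers.insert(o.id);
      subtrees.insert(o.subtree);
    }
  }

  // 2) only pad with random up OSDs from other subtrees if the above
  //    doesn't already give us enough distinct reporter subtrees
  std::shuffle(up_osds.begin(), up_osds.end(), rng);
  for (const auto& o : up_osds) {
    if (subtrees.size() >= min_down_reporters)
      break;
    if (o.id == whoami || peers.count(o.id) || subtrees.count(o.subtree))
      continue;
    peers.insert(o.id);
    subtrees.insert(o.subtree);
  }
  return peers;
}

With something like this, clusters where the PG-sharing peers already span enough failure domains would keep today's peer set, and the cross-pool HB peers would only show up when they are actually needed to reach the reporter threshold.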