This is in part a question of *how many* of those dense OSD nodes you have. If you have a hundred of them, they are most likely spread across a decent number of racks, and the loss of one or two is a tolerable *fraction* of the whole cluster. If your cluster has only, say, 3-4 of these dense nodes, then component failures, network glitches, and even routine maintenance become problematic.

You can *mostly* forestall whole-node rebalancing by carefully aligning your fault domains with the value of mon_osd_down_out_subtree_limit (see the sketch at the end of this message). There are cases where it doesn't kick in and a whole node will attempt to rebalance; even when the CRUSH rules and topology are fault-tolerant, that can drive surviving OSDs into full or backfillfull states and potentially cause an outage. If the limit does kick in, you'll have reduced or no redundancy until you either bring the host/OSDs back up or manually allow recovery to proceed. As was already mentioned, having a small number of fault domains also limits the EC strategies you can safely use.

> Thanks Paul. I was speaking more about total OSDs and RAM, rather than a
> single node. However, I am considering building a cluster with a large
> OSD/node count. This would be for archival use, with reduced performance
> and availability requirements. What issues would you anticipate with a
> large OSD/node count? Is the concern just the large rebalance if a node
> fails and takes out a large portion of the OSDs at once?
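For completeness, here is a minimal sketch of the knob in question. It assumes a reasonably recent release with the centralized config database (ceph config); on older clusters the same option can instead be set in ceph.conf under [mon]. "host" is the usual value for dense-node clusters, but match it to whatever fault domain your CRUSH map actually uses:

    # Don't automatically mark out OSDs when an entire host (or any larger
    # CRUSH subtree) goes down; individual OSD failures are still marked out.
    ceph config set mon mon_osd_down_out_subtree_limit host

    # Verify what the monitors are actually using
    ceph config get mon mon_osd_down_out_subtree_limit

    # Older-style equivalent in ceph.conf:
    # [mon]
    #     mon_osd_down_out_subtree_limit = host

Keep in mind this only suppresses the automatic mark-out: the PGs on the downed host stay degraded until you bring it back or intervene, which is exactly the reduced-redundancy window described above.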