Agreed. I’ve been in a situation where I wasn’t able to stock spare chassis because of an ostensible 2-hour turnaround from vendor support. That turned out to be 2 hours *from when they agreed to replace/repair*, and only if the local depot happened to have parts. As a result, one of my clusters was down 2 mons for >wince< 18 months. Hopefully most production deployments will have better strategies, but it does reinforce Frank’s point.

Beyond 5, one is into diminishing returns. At an OpenStack Summit a handful of years ago a number of Ceph operators gathered, and mon count was discussed. The consensus was that more than 5 was overkill at best, and that the increased inter-mon traffic wasn’t worth it. YMMV.

> On Jan 27, 2022, at 5:34 AM, Frank Schilder <frans@xxxxxx> wrote:
>
> In addition to the workload distribution, the number of mons will also determine how resilient your system is towards admin mistakes. With 5 mons, you can do service on 1 and lose an additional one without losing service. I have been there and would not operate a production cluster with fewer than 5 MONs.
>
> The level of redundancy decides how much trouble you can get into without losing service. Minimum requirements only work well in nice weather.

ceph-users mailing list -- ceph-users@xxxxxxx
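
For anyone who wants the arithmetic behind the "5 mons" point spelled out, here is a rough sketch. It assumes only that the mons need a strict majority to hold quorum; the function names are made up for illustration and are not part of any Ceph tooling.

```python
def quorum_size(mon_count: int) -> int:
    """Smallest strict majority of mon_count monitors."""
    return mon_count // 2 + 1


def tolerable_failures(mon_count: int) -> int:
    """How many mons can be down (service, failure, mistake) with quorum intact."""
    return mon_count - quorum_size(mon_count)


if __name__ == "__main__":
    for n in (3, 4, 5, 7):
        print(f"{n} mons: quorum needs {quorum_size(n)}, "
              f"can lose {tolerable_failures(n)}")
```

With 3 mons, taking one down for service leaves no margin at all; with 5 you can have one in service and still lose another. An even count such as 4 buys nothing over 3 in this respect.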