On Fri, Aug 28, 2015 at 3:53 PM, Tony Nelson <tnelson@xxxxxxxxxxxxx> wrote:

> I recently built a 3 node Proxmox cluster for my office. I'd like to get
> HA set up, and the Proxmox book recommends Ceph. I've been reading the
> documentation and watching videos, and I think I have a grasp on the
> basics, but I don't need anywhere near a petabyte of storage.
>
> I'm considering servers with 12 drive bays: 2 SSDs mirrored for the OS,
> 2 SSDs for journals and the other 8 for OSDs. I was going to purchase 3
> identical servers, and use my 3 Proxmox servers as the monitors, with of
> course gigabit networking in between. Obviously this is very vague, but
> I'm just getting started on the research.
>
> My concern is that I won't have enough physical disks, and therefore
> I'll end up with performance issues.

That's impossible to know without knowing what kind of performance you need.

> I've seen many petabyte+ builds discussed, but not much on the smaller
> side. Does anyone have any guides or reference material I may have
> missed?

The practicalities of fault tolerance are very different in a
minimum-size system (e.g. 3 servers configured for 3 replicas):

* When one disk fails, the default rules mean the only place Ceph can
  re-replicate the PGs that were on that disk is onto other disks in the
  same server where the failure occurred (see the placement sketch
  below). One full disk's worth of data has to flow into that server,
  preferably quickly, to shrink the window in which a second failure
  could cause data loss. Recovering from a 2 TB disk failure will take
  as long as it takes to stream that much data over your 1 Gbps link, so
  your recovery time will be similar to conventional RAID unless you
  install a faster network (back-of-envelope arithmetic below).

* When one server fails, you lose a full third of your bandwidth. That
  means your client workloads would have to be sized to use only about
  2/3 of the theoretical bandwidth in normal operation, or you would
  have to shut down some workloads whenever a server failed. In larger
  systems this isn't such a worry: losing 1 of 32 servers is only about
  a 3% throughput loss (see the last sketch below).

You should compare the price of getting the same amount of disk + RAM
but spread across twice as many servers. The other option is of course a
traditional dual-ported RAID controller.

John
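P.S. A minimal toy model (my own illustration, not Ceph code) of why,
with 3 hosts and size=3 under the default host-level failure domain, a
failed disk's PGs can only be rebuilt onto its sibling disks in the same
host; the host and OSD names are made up:

    # Each PG keeps one replica per host when size == number of hosts.
    hosts = {
        "host-a": [f"a-osd{i}" for i in range(8)],
        "host-b": [f"b-osd{i}" for i in range(8)],
        "host-c": [f"c-osd{i}" for i in range(8)],
    }

    failed_host, failed_osd = "host-a", "a-osd3"

    # A replacement replica must land on a host that doesn't already hold
    # one of the PG's surviving copies.  With one copy per host, the only
    # such host is the one that just lost the disk.
    targets = [osd for osd in hosts[failed_host] if osd != failed_osd]
    print("legal rebuild targets:", targets)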
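And the recovery-time arithmetic, using the 2 TB disk and 1 Gbps link
from the example; the ~70% usable-link figure is my assumption for
protocol overhead plus competing client traffic:

    # Time to stream one failed OSD's worth of data back over the network.
    disk_bytes = 2e12           # 2 TB disk, from the example above
    link_bps = 1e9              # 1 Gbps NIC
    usable = 0.7                # assumed fraction of the link recovery can use

    hours = disk_bytes * 8 / (link_bps * usable) / 3600
    print(f"~{hours:.1f} hours to restore redundancy")   # roughly 6.3 hours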
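Finally, the throughput-loss comparison between a 3-server cluster and a
larger one, matching the 1-of-32 figure above:

    # Fraction of aggregate bandwidth lost when one server goes down.
    for n in (3, 32):
        loss = 1 / n
        print(f"{n:>2} servers: lose {loss:.0%}, "
              f"clients must fit in the remaining {1 - loss:.0%}")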