Hi guys,

We are using Ceph in production, backing an LXC cluster. The setup is:

2 x servers, 24 x 3TB disks each, in groups of 3 as RAID0.
SSDs for journals.
Bonded 1Gbit Ethernet (2Gbit total).

Overnight we had a disk failure. This in itself is not a biggie, but due to the number of VMs we have spawning and shutting down, we are seeing serious problems.

As I understand it, Ceph will do on-demand recovery when a request is made for a degraded object?

Is it possible to make this recovery traffic go via a different network? I was contemplating adding a 10GbE crossover between the servers to ensure this copy can happen super fast.

If anyone has any suggestions on how to avoid this horrible I/O performance hit during recovery, let me know.

Thanks,
Geraint
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com