OSD to OSD Communication

Geraint Jones <geraint@xxxxxxxxxx> · Fri, 30 Aug 2013 11:19:29 -0700

Hi Guys

We are using Ceph in production backing an LXC cluster. The setup is: 2 servers, each with 24 x 3TB disks in groups of 3 as RAID0, SSDs for journals, and bonded 1gbit ethernet (2gbit total).

Overnight we had a disk failure. This in itself is not a biggie, but due to the number of VMs we have spawning/shutting down we are seeing serious problems.

As I understand it, Ceph will do on-demand recovery when a request is made for a degraded object? Is it possible to make this recovery traffic go via a different network? I was contemplating adding a 10gbe crossover between the servers to ensure this copy can happen super fast.
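
To put rough numbers on the 10gbe idea, here is a back-of-envelope sketch. It assumes the worst case where a whole 9TB RAID0 group (3 x 3TB) has to be re-copied from the other server, that the link runs at full line rate, and that nothing else (disks, journals, client I/O) is the bottleneck. The function name and the efficiency parameter are made up for illustration; real recovery will be slower and depends on how full the OSD actually is.

```python
def transfer_hours(data_tb: float, link_gbit: float, efficiency: float = 1.0) -> float:
    """Hours needed to move data_tb (decimal TB) over a link of link_gbit Gbit/s.

    efficiency is a hypothetical fudge factor (0 < efficiency <= 1) for
    protocol overhead and contention; 1.0 means ideal line rate.
    """
    data_bits = data_tb * 1e12 * 8                  # TB -> bits
    seconds = data_bits / (link_gbit * 1e9 * efficiency)
    return seconds / 3600


# Assumed worst case: one full RAID0 group of 3 x 3TB disks.
osd_tb = 3 * 3.0

for name, gbit in [("bonded 2 x 1gbit", 2.0), ("10gbe crossover", 10.0)]:
    print(f"{name}: {transfer_hours(osd_tb, gbit):.1f} h")
# prints:
# bonded 2 x 1gbit: 10.0 h
# 10gbe crossover: 2.0 h
```

So even in the ideal case the current links would be saturated for many hours, which is roughly the window in which client I/O has to compete with recovery.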

If anyone has any suggestions on how to avoid this horrible I/O performance hit during recovery, let me know.

Thanks

Geraint
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com