On 08/30/2013 08:19 PM, Geraint Jones wrote:
Hi Guys,

We are using Ceph in production backing an LXC cluster. The setup is: 2 x servers, each with 24 x 3TB disks in groups of 3 as RAID0, SSDs for journals, and bonded 1Gbit ethernet (2Gbit total).
I think you sized your machines too big. I'd say go for 6 machines with 8 disks each, without RAID-0. Let Ceph do its job and avoid RAID: a 3-disk RAID0 turns a single disk failure into the loss of a whole 9TB OSD, so there is far more data to recover.
Machines this big only work in a very large cluster.
Overnight we have had a disk failure; this in itself is not a biggie, but due to the number of VMs we have spawning and shutting down we are seeing serious problems.
Are you using CephFS? I assume so, given that you are running LXC?
As I understand it, Ceph will do on-demand recovery when a request is made for a degraded object? Is it possible to make this recovery traffic go via a different network? I was contemplating adding a 10GbE crossover between the servers to ensure this copy can happen super fast.
Yes, you can use "cluster_network" to direct inter-OSD replication and recovery traffic over a different network interface, while client traffic stays on the public network.
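For example, a minimal sketch of ceph.conf on both nodes (the subnets here are placeholders, substitute your own ranges):

    [global]
    # client <-> OSD traffic stays on the existing bonded 1GbE links
    public network = 192.168.0.0/24
    # inter-OSD replication/recovery goes over the 10GbE crossover
    cluster network = 10.10.10.0/24

The OSDs need a restart to pick this up, and both machines must be able to reach each other on the cluster network, since OSD heartbeats will use it as well.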
Wido
If anyone has any suggestions on how to avoid this horrible I/O performance hit during recovery, let me know.

Thanks
Geraint
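Besides moving recovery to its own network, you can throttle recovery so client I/O keeps priority. A sketch using the standard OSD options (the values are just a conservative starting point, not tuned for your setup):

    # inject at runtime into all OSDs
    ceph tell osd.* injectargs '--osd-max-backfills 1 --osd-recovery-max-active 1'

    # and make it persistent in ceph.conf
    [osd]
    osd max backfills = 1
    osd recovery max active = 1
    osd recovery op priority = 1

Lowering these makes recovery take longer, but keeps the cluster usable while it runs.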
--
Wido den Hollander
42on B.V.

Phone: +31 (0)20 700 9902
Skype: contact42on