On Friday, 12 April 2013 at 10:04 -0500, Mark Nelson wrote:
> On 04/11/2013 07:25 PM, Ziemowit Pierzycki wrote:
> > No, I'm not using RDMA in this configuration since this will eventually
> > get deployed to production with 10G Ethernet (yes, RDMA is faster). I
> > would prefer Ceph because it has a storage driver built into OpenNebula,
> > which my company is using, and, as you mentioned, individual drives.
> >
> > I'm not sure what the problem is, but it appears to me that one of the
> > hosts may be holding up the rest... With Ceph, if one of the hosts is
> > much faster than the others, could this potentially slow the cluster
> > down to this level?
>
> Definitely! Even one slow OSD can cause dramatic slowdowns. This is
> because we (by default) try to distribute data evenly to every OSD in
> the cluster. If even one OSD is really slow, it will accumulate more and
> more outstanding operations while all of the other OSDs complete their
> requests. Eventually, all of your outstanding operations will be waiting
> on that slow OSD, and all of the other OSDs will sit idle waiting for
> new requests.
>
> If you know that some OSDs are permanently slower than others, you can
> re-weight them so that they receive fewer requests than the others,
> which can mitigate this, but that isn't always an optimal solution.
> Sometimes a slow OSD can also be a sign of other hardware problems.
>
> Mark

And are OSD response times logged anywhere, so that this "weak link" can be identified?

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
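
One way to look for a "weak link" OSD is to compare per-OSD latencies across the cluster. Below is a minimal sketch of that idea, assuming a Ceph release that provides `ceph osd perf` (which reports per-OSD commit/apply latency) and assuming the JSON field names shown in the comments; both the availability of the command and the exact schema vary by release, and the 3x-median threshold is only an illustration.

```python
#!/usr/bin/env python
# Sketch: flag OSDs whose commit latency is far above the cluster median.
# Assumes `ceph osd perf --format json` is available and returns something
# like {"osd_perf_infos": [{"id": 0, "perf_stats": {"commit_latency_ms": ...,
# "apply_latency_ms": ...}}, ...]}; field names differ between releases.
import json
import subprocess

def osd_latencies():
    out = subprocess.check_output(["ceph", "osd", "perf", "--format", "json"])
    data = json.loads(out)
    # Some releases nest the per-OSD list under "osdstats" instead.
    infos = data.get("osd_perf_infos") or data.get("osdstats", {}).get("osd_perf_infos", [])
    return {i["id"]: i["perf_stats"]["commit_latency_ms"] for i in infos}

def median(values):
    s = sorted(values)
    return s[len(s) // 2]

if __name__ == "__main__":
    lat = osd_latencies()
    if not lat:
        raise SystemExit("no OSD perf data returned")
    med = median(list(lat.values()))
    factor = 3  # arbitrary "much slower than the rest" threshold
    for osd_id, ms in sorted(lat.items()):
        flag = "  <-- slow?" if med and ms > factor * med else ""
        print("osd.%d commit latency %s ms%s" % (osd_id, ms, flag))
```

If this confirms that a particular OSD is consistently slow, Mark's suggestion applies: `ceph osd reweight <osd-id> <weight>` reduces the share of data (and therefore requests) mapped to it. Per-OSD counters, including operation latencies, can also be pulled from a running OSD through its admin socket with `perf dump` if the cluster-wide command is not available in your release.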