Re: Poor performance with three nodes

On 10/2/2013 2:24 PM, Gregory Farnum wrote:
> There's a couple things here:
> 1) You aren't accounting for Ceph's journaling. Unlike a system such
> as NFS, Ceph provides *very* strong data integrity guarantees under
> failure conditions, and in order to do so it does full data
> journaling. So, yes, cut your total disk bandwidth in half. (There's
> also a lot of syncing which it manages carefully to reduce the cost,
> but if you had other writes happening via your NFS/iSCSI setups that
> might have been hit by the OSD running a sync on its disk, that could
> be dramatically impacting the perceived throughput.)

I was running iostat on the storage servers to see what was happening during the quick test, and was not seeing large amounts of I/O taking place, whether created by Ceph or not. From what you are saying, I should have seen roughly 120 megabytes per second being written on at least one of the servers as the journal traffic hit it; instead I was seeing roughly 30 megabytes per second on each server.
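To put numbers on the discrepancy: with 3-way replication plus full-data journaling on the same disks, each client megabyte should become roughly six megabytes of disk writes across the cluster. A back-of-envelope check, assuming the ~30 MB/s per server observed via iostat above:

```python
# Back-of-envelope write amplification for Ceph with full-data journaling.
# Assumptions: 3-way replication, journal co-located with the OSD data,
# so every replica write hits the disk twice (journal write + data write).
replicas = 3
journal_factor = 2               # journal + data write per replica
servers = 3
observed_per_server_mb_s = 30    # from iostat on each storage server

total_disk_writes = servers * observed_per_server_mb_s
implied_client_throughput = total_disk_writes / (replicas * journal_factor)
print(f"Aggregate disk writes: {total_disk_writes} MB/s")
print(f"Implied client throughput: {implied_client_throughput:.0f} MB/s")
```

Under those assumptions, ~90 MB/s of aggregate disk writes corresponds to only ~15 MB/s of client throughput, which is consistent with the poor numbers being discussed.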

> 2) Placing an OSD (with its journal) on a RAID-6 is about the worst
> thing you can do for Ceph's performance; it does a lot of small
> flushed-to-disk IOs in the journal in between the full data writes.
> Try some other configuration?

I don't have any other configuration. These servers are in production and cannot be taken down, and I don't have any more servers with 10-gigabit Ethernet cards (which are *not* cheap). I'm starting to suspect that Ceph simply is not usable in my environment, which is a mixed-mode shared environment rather than something that can be dedicated to any single storage protocol. Too bad.
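For anyone reading the archives with more hardware flexibility: the usual mitigation for this is to move the OSD journal off the data disk entirely, ideally onto a dedicated SSD partition, via ceph.conf. A sketch (the device path here is hypothetical; adjust to your hardware):

```ini
[osd.0]
    ; hypothetical path: a partition on a dedicated SSD, not the RAID-6
    osd journal = /dev/sdg1
    osd journal size = 10240    ; journal size in MB
```

This keeps the journal's small synchronous flushes from competing with the full data writes on the same spindles.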



> 3) Did you explicitly set your PG counts at any point? They default to
> 8, which is entirely too low; given your setup you should have
> 400-1000 per pool.

They defaulted to 64, and according to the calculation in the documentation they should be 500/3 ≈ 166 per pool for my configuration. Still, that does not appear to be the issue here, considering that I've created one block device and that's the only Ceph traffic. I just raised the data pool to 166; no difference.
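For reference, the rule of thumb from the docs being applied here is (number of OSDs × 100) / replica count; a quick check of the arithmetic for this 5-OSD, 3-replica cluster:

```python
# PG sizing rule of thumb from the Ceph documentation:
#   pg_count ≈ (num_osds * 100) / replica_count
num_osds = 5
replicas = 3
pg_count = num_osds * 100 // replicas   # 500 // 3 = 166
print(pg_count)
```

The value was applied to the data pool with `ceph osd pool set data pg_num 166` (and note that in this era of Ceph, `pgp_num` must be raised to match before the new PGs actually rebalance).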

> 4) There could have been something wrong/going on with the system;
> though I doubt it. But if you can provide the output of "ceph -s"
> that'll let us check the basics.

Everything looks healthy.

[root@stack1 ~]# ceph -s
  cluster 26206dba-e976-4217-a3d4-c9ea02c188be
   health HEALTH_OK
   monmap e2: 3 mons at {stack1=10.200.0.3:6789/0,storage1=10.200.0.1:6789/0,storage2=10.200.0.2:6789/0}, election epoch 112, quorum 0,1,2 stack1,storage1,storage2
   osdmap e793: 5 osds: 5 up, 5 in
   pgmap v3504: 295 pgs: 295 active+clean; 21264 MB data, 30154 MB used, 10205 GB / 10235 GB avail
   mdsmap e25: 1/1/1 up {0=stack1=up:active}



> Separately, if all you want is to ensure that data resides on at least
> two servers, there are better ways than saying "each server has two
> daemons, so I'll do 3-copy". See eg

I was as concerned about performance as I was about redundancy when I set it to three copies. I saw the CRUSH map rules, but for my purposes simply having three replicas was sufficient.
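For the archives, the CRUSH-rule alternative Greg alludes to is a `chooseleaf` step over hosts, which guarantees replicas land on distinct servers regardless of how many OSD daemons each server runs. A sketch of such a rule (the rule name is illustrative):

```
rule replicated_across_hosts {
    ruleset 1
    type replicated
    min_size 1
    max_size 10
    step take default
    step chooseleaf firstn 0 type host
    step emit
}
```

With a rule like this, two copies on two hosts already survive a full server failure, without paying the write cost of a third replica.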


_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



