Hi,

I wouldn't put those SSDs in RAID; just use them separately as journals, each SSD serving half of your HDDs. This should make your write performance somewhat better (a rough sketch of what I mean is at the end of this mail).

On 04.07.2014 at 11:13, Marco Allevato <m.allevato at nwe.de> wrote:

> Hello Ceph-Community,
>
> I'm writing here because we have a bad write-performance on our
> Ceph-Cluster of about
>
> As an overview, the technical details of our cluster:
>
> 3 x monitoring servers; each with 2 x 1 Gbit/s NIC configured as a bond
> (link aggregation mode)
>
> 5 x datastore servers; each with 10 x 4 TB HDDs serving as OSDs; as
> journal we use a 15 GB LVM volume on a 256 GB SSD RAID 1; 2 x 10 Gbit/s
> NIC configured as a bond (link aggregation mode)
>
> ceph.conf:
>
> [global]
> auth_service_required = cephx
> filestore_xattr_use_omap = true
> auth_client_required = cephx
> auth_cluster_required = cephx
> mon_host = 172.30.30.8,172.30.30.9
> mon_initial_members = monitoring1, monitoring2, monitoring3
> fsid = 5f22ab94-8d96-48c2-88d3-cff7bad443a9
> public network = 172.30.30.0/24
>
> [mon.monitoring1]
> host = monitoring1
> addr = 172.30.30.8:6789
>
> [mon.monitoring2]
> host = monitoring2
> addr = 172.30.30.9:6789
>
> [mon.monitoring3]
> host = monitoring3
> addr = 172.30.30.10:6789
>
> [filestore]
> filestore max sync interval = 10
>
> [osd]
> osd recovery max active = 1
> osd journal size = 15360
> osd op threads = 40
> osd disk threads = 40
>
> [osd.0]
> host = datastore1
> [osd.1]
> host = datastore1
> [osd.2]
> host = datastore1
> [osd.3]
> host = datastore1
> [osd.4]
> host = datastore1
> [osd.5]
> host = datastore1
> [osd.6]
> host = datastore1
> [osd.7]
> host = datastore1
> [osd.8]
> host = datastore1
> [osd.9]
> host = datastore1
>
> [osd.10]
> host = datastore2
> [osd.11]
> host = datastore2
> [osd.11]
> host = datastore2
> [osd.12]
> host = datastore2
> [osd.13]
> host = datastore2
> [osd.14]
> host = datastore2
> [osd.15]
> host = datastore2
> [osd.16]
> host = datastore2
> [osd.17]
> host = datastore2
> [osd.18]
> host = datastore2
> [osd.19]
> host = datastore2
>
> [osd.20]
> host = datastore3
> [osd.21]
> host = datastore3
> [osd.22]
> host = datastore3
> [osd.23]
> host = datastore3
> [osd.24]
> host = datastore3
> [osd.25]
> host = datastore3
> [osd.26]
> host = datastore3
> [osd.27]
> host = datastore3
> [osd.28]
> host = datastore3
> [osd.29]
> host = datastore3
>
> [osd.30]
> host = datastore4
> [osd.31]
> host = datastore4
> [osd.32]
> host = datastore4
> [osd.33]
> host = datastore4
> [osd.34]
> host = datastore4
> [osd.35]
> host = datastore4
> [osd.36]
> host = datastore4
> [osd.37]
> host = datastore4
> [osd.38]
> host = datastore4
> [osd.39]
> host = datastore4
>
> [osd.0]
> host = datastore5
> [osd.40]
> host = datastore5
> [osd.41]
> host = datastore5
> [osd.42]
> host = datastore5
> [osd.43]
> host = datastore5
> [osd.44]
> host = datastore5
> [osd.45]
> host = datastore5
> [osd.46]
> host = datastore5
> [osd.47]
> host = datastore5
> [osd.48]
> host = datastore5
>
> We have 3 pools:
>
> -> 2 x 1000 PGs with 2 replicas, distributing the data equally to two
>    racks (used for datastore 1-4)
> -> 1 x 100 PGs without replication; data only stored on datastore 5.
>    This pool is used to compare the performance on local disks without
>    networking.
>
> Here are the performance values I get using fio on a 32 GB RBD:
>
> On the 1000 PGs pool with distribution:
>
> fio --bs=1M --rw=randwrite --ioengine=libaio --direct=1 --iodepth=32
>     --runtime=60 --name=/dev/rbd/pool1/bench1
>
> fio-2.0.13
> Starting 1 process
> Jobs: 1 (f=1): [w] [100.0% done] [0K/312.0M/0K /s] [0 /312 /0 iops] [eta 00m:00s]
> /dev/rbd/pool1/bench1: (groupid=0, jobs=1): err= 0: pid=21675: Fri Jul  4 11:03:52 2014
>   write: io=21071MB, bw=358989KB/s, iops=350, runt= 60104msec
>     slat (usec): min=127, max=8040, avg=511.49, stdev=216.27
>     clat (msec): min=5, max=4018, avg=90.74, stdev=215.83
>      lat (msec): min=6, max=4018, avg=91.25, stdev=215.83
>     clat percentiles (msec):
>      |  1.00th=[    8],  5.00th=[    9], 10.00th=[   11], 20.00th=[   15],
>      | 30.00th=[   21], 40.00th=[   30], 50.00th=[   45], 60.00th=[   63],
>      | 70.00th=[   83], 80.00th=[  105], 90.00th=[  129], 95.00th=[  190],
>      | 99.00th=[ 1254], 99.50th=[ 1680], 99.90th=[ 2409], 99.95th=[ 2638],
>      | 99.99th=[ 3556]
>     bw (KB/s): min=68210, max=479232, per=100.00%, avg=368399.55, stdev=84457.12
>     lat (msec): 10=9.50%, 20=20.02%, 50=23.56%, 100=24.56%, 250=18.09%
>     lat (msec): 500=1.39%, 750=0.81%, 1000=0.65%, 2000=1.13%, >=2000=0.29%
>   cpu       : usr=11.17%, sys=7.46%, ctx=17772, majf=0, minf=24
>   IO depths : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=99.9%, >=64=0.0%
>      submit  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
>      complete: 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.1%, 64=0.0%, >=64=0.0%
>      issued  : total=r=0/w=21071/d=0, short=r=0/w=0/d=0
>
> Run status group 0 (all jobs):
>   WRITE: io=21071MB, aggrb=358989KB/s, minb=358989KB/s, maxb=358989KB/s,
>          mint=60104msec, maxt=60104msec
>
> On the 100 PGs pool without distribution:
>
>   WRITE: io=5884.0MB, aggrb=297953KB/s, minb=297953KB/s, maxb=297953KB/s,
>          mint=20222msec, maxt=20222msec
>
> Do you have any suggestions on how to improve the performance?
> From what I have read, typical write rates should be around 800-1000 MB/s
> when using a 10 Gbit/s connection with a similar setup.
>
> Thanks in advance
>
> --
> Marco Allevato
> Projektteam
>
> Network Engineering GmbH
> Maximilianstrasse 93
> D-67346 Speyer

--
Konrad Gutkowski
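P.S. To make the journal suggestion a bit more concrete, here is a rough, untested sketch. The device names (/dev/sdk and /dev/sdl for the two SSDs) and the partition layout are assumptions, not taken from your setup; adapt them to your hardware.

  # Break the SSD RAID 1, then carve five ~15 GB journal partitions out of
  # each SSD, one partition per OSD (five OSDs per SSD, ten per host):
  parted /dev/sdk --script mklabel gpt
  parted /dev/sdk --script mkpart journal-0 1MiB 15GiB
  parted /dev/sdk --script mkpart journal-1 15GiB 30GiB
  #   ... continue up to journal-4, then do the same on /dev/sdl

  # Point each OSD at its own raw partition in ceph.conf, for example:
  #   [osd.0]
  #   host = datastore1
  #   osd journal = /dev/sdk1
  #
  #   [osd.1]
  #   host = datastore1
  #   osd journal = /dev/sdk2

  # To move an existing journal: stop the OSD, flush the old journal,
  # update "osd journal" in ceph.conf, recreate it and start the OSD:
  service ceph stop osd.0
  ceph-osd -i 0 --flush-journal
  ceph-osd -i 0 --mkjournal
  service ceph start osd.0

Using raw partitions also takes the md and LVM layers out of the synchronous journal write path; every extra layer there adds latency to those writes.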
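It may also help to benchmark the object layer directly, so you can see whether RBD or the filesystem above it contributes to the problem. A sketch, assuming the pool from your fio run is called pool1 and you run this from a client with the admin keyring:

  # RADOS-level write benchmark: 60 seconds, 32 objects in flight
  rados bench -p pool1 60 write -t 32 --no-cleanup
  # sequential read of the objects written above
  rados bench -p pool1 60 seq -t 32

  # For the RBD test, give fio the device explicitly via --filename;
  # with only --name, fio may create a plain file named after the job
  # instead of writing to the block device:
  fio --name=rbd-write --filename=/dev/rbd/pool1/bench1 \
      --bs=1M --rw=randwrite --ioengine=libaio --direct=1 \
      --iodepth=32 --runtime=60 --time_based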
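Regarding the 800-1000 MB/s you read about: with LACP a single TCP stream only ever uses one member link, so roughly 10 Gbit/s (about 1.2 GB/s) is the per-stream ceiling regardless of how the bond is configured. Before tuning Ceph further I would verify that the links actually deliver that, e.g. with iperf (hostnames below are just placeholders):

  # on one datastore node
  iperf -s

  # on the client (or another datastore node): single stream, then 4 parallel
  iperf -c datastore1 -t 30
  iperf -c datastore1 -t 30 -P 4

If a single stream lands well below line rate, the network is worth looking at before Ceph itself.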