Yes, the Crucial is not suitable for this. If you write sequential data like the journal for around 1-2 hours, the speed drops to 80 MB/s. It also has very low performance in sync/flush mode, which is what the journal uses.

Stefan

Excuse my typos - sent from my mobile phone.

On 01.09.2014 at 07:10, Alexandre DERUMIER <aderumier at odiso.com> wrote:

>>> Allegedly this model ssd (128G m550) can do 75K 4k random write IOPS
>>> (running fio on the filesystem I've seen 70K IOPS so is reasonably
>>> believable). So anyway we are not getting anywhere near the max IOPS
>>> from our devices.
>
> Hi,
> Just check this:
>
> http://www.anandtech.com/show/7864/crucial-m550-review-128gb-256gb-512gb-and-1tb-models-tested/3
>
> If the ssd is full of data, the performance is far from 75K - more like 7K.
>
> I think only high-end DC ssds (SLC) can provide consistent results around 40K-50K.
>
> ----- Original Message -----
>
> From: "Mark Kirkwood" <mark.kirkwood at catalyst.net.nz>
> To: "Sebastien Han" <sebastien.han at enovance.com>, "ceph-users" <ceph-users at lists.ceph.com>
> Sent: Monday, 1 September 2014 02:36:45
> Subject: Re: [ceph-users] [Single OSD performance on SSD] Can't go over 3, 2K IOPS
>
>> On 31/08/14 17:55, Mark Kirkwood wrote:
>>> On 29/08/14 22:17, Sebastien Han wrote:
>>>
>>> @Mark thanks for trying this :)
>>> Unfortunately using nobarrier and another dedicated SSD for the
>>> journal (plus your ceph settings) didn't bring much - now I can reach
>>> 3.5K IOPS.
>>> By any chance, would it be possible for you to test with a single OSD
>>> SSD?
>>
>> Funny you should bring this up - I have just updated my home system with
>> a pair of Crucial m550s, so I figured I'd try a run with 2x ssd (1 for
>> journal and 1 for data) and with 1x ssd (journal + data).
>>
>> The results were the opposite of what I expected (see below), with 2x
>> ssd getting about 6K IOPS and 1x ssd getting 8K IOPS (wtf):
>>
>> I'm running this on Ubuntu 14.04 + ceph git master from a few days ago:
>>
>> $ ceph --version
>> ceph version 0.84-562-g8d40600 (8d406001d9b84d9809d181077c61ad9181934752)
>>
>> The data partition was created with:
>>
>> $ sudo mkfs.xfs -f -l lazy-count=1 /dev/sdd4
>>
>> and mounted via:
>>
>> $ sudo mount -o nobarrier,allocsize=4096 /dev/sdd4 /ceph2
>>
>> I've attached my ceph.conf and the fio template FWIW.
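(The attached ceph.conf and fio template are not reproduced in the archive. Judging only from the parameters visible in the results below - ioengine=rbd, 4k randwrite, iodepth 64, 1 GB written - the fio job file would have looked roughly like the sketch that follows; the client, pool and image names are placeholders, not Mark's actual values:

    [global]
    ioengine=rbd
    clientname=admin     # cephx user - placeholder
    pool=rbd             # pool holding the test image - placeholder
    rbdname=fio_test     # pre-created RBD image - placeholder
    rw=randwrite
    bs=4k
    iodepth=64
    size=1g

    [rbd_thread]

The rbd ioengine drives the image through librbd directly, so the numbers below measure the OSD path without a guest VM in between.)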
>>
>> 2x Crucial m550 (1x journal, 1x data)
>>
>> rbd_thread: (g=0): rw=randwrite, bs=4K-4K/4K-4K/4K-4K, ioengine=rbd, iodepth=64
>> fio-2.1.11-20-g9a44
>> Starting 1 process
>> rbd_thread: (groupid=0, jobs=1): err= 0: pid=5511: Sun Aug 31 17:33:40 2014
>>   write: io=1024.0MB, bw=24694KB/s, iops=6173, runt= 42462msec
>>     slat (usec): min=11, max=4086, avg=51.19, stdev=59.30
>>     clat (msec): min=3, max=24, avg= 9.99, stdev= 1.57
>>      lat (msec): min=3, max=24, avg=10.04, stdev= 1.57
>>     clat percentiles (usec):
>>      |  1.00th=[ 6624],  5.00th=[ 7584], 10.00th=[ 8032], 20.00th=[ 8640],
>>      | 30.00th=[ 9152], 40.00th=[ 9536], 50.00th=[ 9920], 60.00th=[10304],
>>      | 70.00th=[10816], 80.00th=[11328], 90.00th=[11968], 95.00th=[12480],
>>      | 99.00th=[13888], 99.50th=[14528], 99.90th=[17024], 99.95th=[19584],
>>      | 99.99th=[23168]
>>     bw (KB /s): min=23158, max=25592, per=100.00%, avg=24711.65, stdev=470.72
>>     lat (msec) : 4=0.01%, 10=50.69%, 20=49.26%, 50=0.04%
>>   cpu          : usr=25.27%, sys=2.68%, ctx=266729, majf=0, minf=16773
>>   IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.3%, 32=83.8%, >=64=15.8%
>>      submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
>>      complete  : 0=0.0%, 4=93.8%, 8=2.9%, 16=2.2%, 32=1.0%, 64=0.1%, >=64=0.0%
>>      issued    : total=r=0/w=262144/d=0, short=r=0/w=0/d=0
>>      latency   : target=0, window=0, percentile=100.00%, depth=64
>>
>> 1x Crucial m550 (journal + data)
>>
>> rbd_thread: (g=0): rw=randwrite, bs=4K-4K/4K-4K/4K-4K, ioengine=rbd, iodepth=64
>> fio-2.1.11-20-g9a44
>> Starting 1 process
>> rbd_thread: (groupid=0, jobs=1): err= 0: pid=6887: Sun Aug 31 17:42:22 2014
>>   write: io=1024.0MB, bw=32778KB/s, iops=8194, runt= 31990msec
>>     slat (usec): min=10, max=4016, avg=45.68, stdev=41.60
>>     clat (usec): min=428, max=25688, avg=7658.03, stdev=1600.65
>>      lat (usec): min=923, max=25757, avg=7703.72, stdev=1598.77
>>     clat percentiles (usec):
>>      |  1.00th=[ 3440],  5.00th=[ 5216], 10.00th=[ 6048], 20.00th=[ 6624],
>>      | 30.00th=[ 7008], 40.00th=[ 7328], 50.00th=[ 7584], 60.00th=[ 7904],
>>      | 70.00th=[ 8256], 80.00th=[ 8640], 90.00th=[ 9280], 95.00th=[10048],
>>      | 99.00th=[12864], 99.50th=[14528], 99.90th=[17536], 99.95th=[19328],
>>      | 99.99th=[21888]
>>     bw (KB /s): min=30768, max=35160, per=100.00%, avg=32907.35, stdev=934.80
>>     lat (usec) : 500=0.01%, 1000=0.01%
>>     lat (msec) : 2=0.04%, 4=1.80%, 10=93.15%, 20=4.97%, 50=0.04%
>>   cpu          : usr=32.32%, sys=3.05%, ctx=179657, majf=0, minf=16751
>>   IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.2%, 32=59.7%, >=64=40.0%
>>      submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
>>      complete  : 0=0.0%, 4=96.8%, 8=2.6%, 16=0.5%, 32=0.1%, 64=0.1%, >=64=0.0%
>>      issued    : total=r=0/w=262144/d=0, short=r=0/w=0/d=0
>>      latency   : target=0, window=0, percentile=100.00%, depth=64
>
> I'm digging a bit more to try to understand this slightly surprising result.
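(One thing worth measuring while digging, given the point above about sync/flush behaviour: the OSD journal does small direct, synchronous writes, so the drive's raw synchronous 4k write rate puts a ceiling on journal throughput regardless of what the cached random-write spec says. A rough way to get that number - the device path and test file here are placeholders, and the first command will destroy data if pointed at the wrong device:

    $ sudo fio --name=journal-sync-test --filename=/dev/sdX --direct=1 --sync=1 \
          --rw=write --bs=4k --numjobs=1 --iodepth=1 --runtime=60 --time_based

or, even cruder:

    $ dd if=/dev/zero of=/ceph2/dd-sync-test bs=4k count=100000 oflag=direct,dsync

Consumer drives often manage only a few hundred to a few thousand IOPS on this kind of test even when their headline random-write numbers look great.)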
>
> For that last benchmark I'd used a file rather than a device journal on
> the same ssd:
>
> $ ls -l /ceph2
> total 15360040
> -rw-r--r--  1 root root          37 Sep  1 12:00 ceph_fsid
> drwxr-xr-x 68 root root        4096 Sep  1 12:00 current
> -rw-r--r--  1 root root          37 Sep  1 12:00 fsid
> -rw-r--r--  1 root root 15728640000 Sep  1 12:00 journal
> -rw-------  1 root root          56 Sep  1 12:00 keyring
> -rw-r--r--  1 root root          21 Sep  1 12:00 magic
> -rw-r--r--  1 root root           6 Sep  1 12:00 ready
> -rw-r--r--  1 root root           4 Sep  1 12:00 store_version
> -rw-r--r--  1 root root          53 Sep  1 12:00 superblock
> -rw-r--r--  1 root root           2 Sep  1 12:00 whoami
>
> Let's try a more standard device journal on another partition of the
> same ssd. 1x Crucial m550 (device journal + data):
>
> $ ls -l /ceph2
> total 36
> -rw-r--r--  1 root root   37 Sep  1 12:02 ceph_fsid
> drwxr-xr-x 68 root root 4096 Sep  1 12:02 current
> -rw-r--r--  1 root root   37 Sep  1 12:02 fsid
> lrwxrwxrwx  1 root root    9 Sep  1 12:02 journal -> /dev/sdd1
> -rw-------  1 root root   56 Sep  1 12:02 keyring
> -rw-r--r--  1 root root   21 Sep  1 12:02 magic
> -rw-r--r--  1 root root    6 Sep  1 12:02 ready
> -rw-r--r--  1 root root    4 Sep  1 12:02 store_version
> -rw-r--r--  1 root root   53 Sep  1 12:02 superblock
> -rw-r--r--  1 root root    2 Sep  1 12:02 whoami
>
> rbd_thread: (g=0): rw=randwrite, bs=4K-4K/4K-4K/4K-4K, ioengine=rbd, iodepth=64
> fio-2.1.11-20-g9a44
> Starting 1 process
> rbd_thread: (groupid=0, jobs=1): err= 0: pid=4463: Mon Sep  1 09:16:16 2014
>   write: io=1024.0MB, bw=22105KB/s, iops=5526, runt= 47436msec
>     slat (usec): min=11, max=4054, avg=52.66, stdev=62.79
>     clat (msec): min=3, max=43, avg=11.20, stdev= 1.69
>      lat (msec): min=4, max=43, avg=11.25, stdev= 1.69
>     clat percentiles (usec):
>      |  1.00th=[ 7904],  5.00th=[ 8896], 10.00th=[ 9408], 20.00th=[10048],
>      | 30.00th=[10432], 40.00th=[10688], 50.00th=[11072], 60.00th=[11456],
>      | 70.00th=[11712], 80.00th=[12224], 90.00th=[12992], 95.00th=[13888],
>      | 99.00th=[16768], 99.50th=[17792], 99.90th=[20352], 99.95th=[24960],
>      | 99.99th=[42240]
>     bw (KB /s): min=20285, max=23537, per=100.00%, avg=22126.98, stdev=579.19
>     lat (msec) : 4=0.01%, 10=20.03%, 20=79.86%, 50=0.11%
>   cpu          : usr=23.48%, sys=2.58%, ctx=302278, majf=0, minf=16786
>   IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.6%, 32=82.8%, >=64=16.6%
>      submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
>      complete  : 0=0.0%, 4=93.9%, 8=3.0%, 16=2.0%, 32=1.0%, 64=0.1%, >=64=0.0%
>      issued    : total=r=0/w=262144/d=0, short=r=0/w=0/d=0
>      latency   : target=0, window=0, percentile=100.00%, depth=64
>
> So we seem to lose performance a bit there. Finally let's use 2 ssds
> again, but with a file journal only on the 2nd one.
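(For reference, the file vs. block-device journal variants being compared here come down to what the osd's journal path points at. A sketch only - not taken from the attached ceph.conf, and osd.0 is a placeholder id; the paths and the 15000 MB size are inferred from the directory listings above:

    # file journal (here on the data ssd's filesystem; for the 2-ssd case,
    # on a filesystem on the 2nd ssd)
    [osd.0]
        osd journal = /ceph2/journal
        osd journal size = 15000    ; MB, matches the 15728640000-byte file above

    # device journal: point the path at a raw partition instead;
    # no size needed, the whole partition is used
    [osd.0]
        osd journal = /dev/sdd1
)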
> 2x Crucial m550 (1x file journal, 1x data):
>
> rbd_thread: (g=0): rw=randwrite, bs=4K-4K/4K-4K/4K-4K, ioengine=rbd, iodepth=64
> Starting 1 process
> fio-2.1.11-20-g9a44
>
> rbd_thread: (groupid=0, jobs=1): err= 0: pid=6943: Mon Sep  1 11:18:01 2014
>   write: io=1024.0MB, bw=32248KB/s, iops=8062, runt= 32516msec
>     slat (usec): min=11, max=4843, avg=45.42, stdev=43.57
>     clat (usec): min=657, max=22614, avg=7806.70, stdev=1319.02
>      lat (msec): min=1, max=22, avg= 7.85, stdev= 1.32
>     clat percentiles (usec):
>      |  1.00th=[ 4384],  5.00th=[ 5984], 10.00th=[ 6432], 20.00th=[ 6880],
>      | 30.00th=[ 7200], 40.00th=[ 7520], 50.00th=[ 7776], 60.00th=[ 8032],
>      | 70.00th=[ 8384], 80.00th=[ 8640], 90.00th=[ 9152], 95.00th=[ 9664],
>      | 99.00th=[11328], 99.50th=[13376], 99.90th=[17536], 99.95th=[18304],
>      | 99.99th=[21376]
>     bw (KB /s): min=30408, max=35320, per=100.00%, avg=32339.56, stdev=937.80
>     lat (usec) : 750=0.01%
>     lat (msec) : 2=0.03%, 4=0.70%, 10=95.96%, 20=3.29%, 50=0.02%
>   cpu          : usr=31.37%, sys=3.42%, ctx=181872, majf=0, minf=16759
>   IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=56.6%, >=64=43.3%
>      submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
>      complete  : 0=0.0%, 4=97.1%, 8=2.4%, 16=0.4%, 32=0.1%, 64=0.1%, >=64=0.0%
>      issued    : total=r=0/w=262144/d=0, short=r=0/w=0/d=0
>      latency   : target=0, window=0, percentile=100.00%, depth=64
>
> So we are up to 8K IOPS again. Note that we are not maxing out the ssds:
>
> Device:  rrqm/s   wrqm/s     r/s      w/s    rMB/s    wMB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
> sda        0.00     0.00    0.00     0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
> sdb        0.00     0.00    0.00     0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
> sdd        0.00  5048.00    0.00  7550.00     0.00    83.43    22.63     2.80    0.37    0.00    0.37   0.04  31.60
> sdc        0.00     0.00    0.00  7145.00     0.00    72.21    20.70     0.27    0.04    0.00    0.04   0.04  26.80
>
> Allegedly this model ssd (128G m550) can do 75K 4k random write IOPS
> (running fio on the filesystem I've seen 70K IOPS, so that is reasonably
> believable). So anyway we are not getting anywhere near the max IOPS
> from our devices.
>
> We use the Intel S3700 for production ceph servers, so I'll see if we
> have any I can test on - would be interesting to see if I find the same
> 3.5K issue or not.
>
> Cheers
>
> Mark
>
> _______________________________________________
> ceph-users mailing list
> ceph-users at lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
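(A closing aside on those numbers: during the 8K-IOPS client run, iostat shows roughly 7.5K and 7.1K writes/s on the two ssds at only about 30% utilisation, so the drives themselves clearly have headroom and the bottleneck is further up the OSD stack. The "70K IOPS running fio on the filesystem" baseline quoted above was presumably a plain filesystem-level random-write test; a command along these lines would give a comparable number - the directory and size are placeholders, not the exact invocation used:

    $ fio --name=fs-baseline --directory=/ceph2 --ioengine=libaio --direct=1 \
          --rw=randwrite --bs=4k --iodepth=64 --size=1g

That takes ceph, the journal and its sync writes out of the picture entirely and measures just the ssd + xfs, which is why the gap between it and the rbd results above is the interesting part.)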