[Single OSD performance on SSD] Can't go over 3.2K IOPS

>>Allegedly this model ssd (128G m550) can do 75K 4k random write IOPS 
>>(running fio on the filesystem I've seen 70K IOPS so is reasonably 
>>believable). So anyway we are not getting anywhere near the max IOPS 
>>from our devices. 

Hi,
Just check this:

http://www.anandtech.com/show/7864/crucial-m550-review-128gb-256gb-512gb-and-1tb-models-tested/3


If the ssd is full of data, the performance is far from 75K, more like 7K.

I think only high-end DC ssds (SLC) can provide consistent results around 40K-50K.
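
If you want to check the steady-state number yourself, precondition the 
drive and then measure sustained random writes, something like the 
following (destructive to the device; /dev/sdX is only a placeholder): 

# fill the whole device once with sequential writes 
$ sudo fio --name=precondition --filename=/dev/sdX --ioengine=libaio \
      --direct=1 --rw=write --bs=1M --iodepth=32 

# then measure sustained 4k random writes for 10 minutes 
$ sudo fio --name=steady4k --filename=/dev/sdX --ioengine=libaio \
      --direct=1 --rw=randwrite --bs=4k --iodepth=32 --runtime=600 --time_based 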




	


----- Original Message ----- 

From: "Mark Kirkwood" <mark.kirkwood at catalyst.net.nz> 
To: "Sebastien Han" <sebastien.han at enovance.com>, "ceph-users" <ceph-users at lists.ceph.com> 
Sent: Monday, 1 September 2014 02:36:45 
Subject: Re: [ceph-users] [Single OSD performance on SSD] Can't go over 3.2K IOPS 

On 31/08/14 17:55, Mark Kirkwood wrote: 
> On 29/08/14 22:17, Sebastien Han wrote: 
> 
>> @Mark thanks trying this :) 
>> Unfortunately using nobarrier and another dedicated SSD for the 
>> journal (plus your ceph setting) didn't bring much, now I can reach 
>> 3.5K IOPS. 
>> By any chance, would it be possible for you to test with a single OSD 
>> SSD? 
>> 
> 
> Funny you should bring this up - I have just updated my home system with 
> a pair of Crucial m550. So I figured I'd try a run with 2x ssd (1 for 
> journal and 1 for data) and 1x ssd (journal + data). 
> 
> 
> The results were the opposite of what I expected (see below), with 2x 
> ssd getting about 6K IOPS and 1x ssd getting 8K IOPS (wtf): 
> 
> I'm running this on Ubuntu 14.04 + ceph git master from a few days ago: 
> 
> $ ceph --version 
> ceph version 0.84-562-g8d40600 (8d406001d9b84d9809d181077c61ad9181934752) 
> 
> The data partition was created with: 
> 
> $ sudo mkfs.xfs -f -l lazy-count=1 /dev/sdd4 
> 
> and mounted via: 
> 
> $ sudo mount -o nobarrier,allocsize=4096 /dev/sdd4 /ceph2 
> 
> 
> I've attached my ceph.conf and the fio template FWIW. 
> 
> 2x Crucial m550 (1x journal, 1x data) 
> 
> rbd_thread: (g=0): rw=randwrite, bs=4K-4K/4K-4K/4K-4K, ioengine=rbd, 
> iodepth=64 
> fio-2.1.11-20-g9a44 
> Starting 1 process 
> rbd_thread: (groupid=0, jobs=1): err= 0: pid=5511: Sun Aug 31 17:33:40 2014 
> write: io=1024.0MB, bw=24694KB/s, iops=6173, runt= 42462msec 
> slat (usec): min=11, max=4086, avg=51.19, stdev=59.30 
> clat (msec): min=3, max=24, avg= 9.99, stdev= 1.57 
> lat (msec): min=3, max=24, avg=10.04, stdev= 1.57 
> clat percentiles (usec): 
> | 1.00th=[ 6624], 5.00th=[ 7584], 10.00th=[ 8032], 20.00th=[ 8640], 
> | 30.00th=[ 9152], 40.00th=[ 9536], 50.00th=[ 9920], 60.00th=[10304], 
> | 70.00th=[10816], 80.00th=[11328], 90.00th=[11968], 95.00th=[12480], 
> | 99.00th=[13888], 99.50th=[14528], 99.90th=[17024], 99.95th=[19584], 
> | 99.99th=[23168] 
> bw (KB /s): min=23158, max=25592, per=100.00%, avg=24711.65, 
> stdev=470.72 
> lat (msec) : 4=0.01%, 10=50.69%, 20=49.26%, 50=0.04% 
> cpu : usr=25.27%, sys=2.68%, ctx=266729, majf=0, minf=16773 
> IO depths : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.3%, 32=83.8%, 
> >=64=15.8% 
> submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, 
> >=64=0.0% 
> complete : 0=0.0%, 4=93.8%, 8=2.9%, 16=2.2%, 32=1.0%, 64=0.1%, 
> >=64=0.0% 
> issued : total=r=0/w=262144/d=0, short=r=0/w=0/d=0 
> latency : target=0, window=0, percentile=100.00%, depth=64 
> 
> 1x Crucial m550 (journal + data) 
> 
> rbd_thread: (g=0): rw=randwrite, bs=4K-4K/4K-4K/4K-4K, ioengine=rbd, 
> iodepth=64 
> fio-2.1.11-20-g9a44 
> Starting 1 process 
> rbd_thread: (groupid=0, jobs=1): err= 0: pid=6887: Sun Aug 31 17:42:22 2014 
> write: io=1024.0MB, bw=32778KB/s, iops=8194, runt= 31990msec 
> slat (usec): min=10, max=4016, avg=45.68, stdev=41.60 
> clat (usec): min=428, max=25688, avg=7658.03, stdev=1600.65 
> lat (usec): min=923, max=25757, avg=7703.72, stdev=1598.77 
> clat percentiles (usec): 
> | 1.00th=[ 3440], 5.00th=[ 5216], 10.00th=[ 6048], 20.00th=[ 6624], 
> | 30.00th=[ 7008], 40.00th=[ 7328], 50.00th=[ 7584], 60.00th=[ 7904], 
> | 70.00th=[ 8256], 80.00th=[ 8640], 90.00th=[ 9280], 95.00th=[10048], 
> | 99.00th=[12864], 99.50th=[14528], 99.90th=[17536], 99.95th=[19328], 
> | 99.99th=[21888] 
> bw (KB /s): min=30768, max=35160, per=100.00%, avg=32907.35, 
> stdev=934.80 
> lat (usec) : 500=0.01%, 1000=0.01% 
> lat (msec) : 2=0.04%, 4=1.80%, 10=93.15%, 20=4.97%, 50=0.04% 
> cpu : usr=32.32%, sys=3.05%, ctx=179657, majf=0, minf=16751 
> IO depths : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.2%, 32=59.7%, 
> >=64=40.0% 
> submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, 
> >=64=0.0% 
> complete : 0=0.0%, 4=96.8%, 8=2.6%, 16=0.5%, 32=0.1%, 64=0.1%, 
> >=64=0.0% 
> issued : total=r=0/w=262144/d=0, short=r=0/w=0/d=0 
> latency : target=0, window=0, percentile=100.00%, depth=64 
> 
> 
> 
> 

I'm digging a bit more to try to understand this slightly surprising result. 

For that last benchmark I'd used a file rather than a device journal on 
the same ssd: 

$ ls -l /ceph2 
total 15360040 
-rw-r--r-- 1 root root 37 Sep 1 12:00 ceph_fsid 
drwxr-xr-x 68 root root 4096 Sep 1 12:00 current 
-rw-r--r-- 1 root root 37 Sep 1 12:00 fsid 
-rw-r--r-- 1 root root 15728640000 Sep 1 12:00 journal 
-rw------- 1 root root 56 Sep 1 12:00 keyring 
-rw-r--r-- 1 root root 21 Sep 1 12:00 magic 
-rw-r--r-- 1 root root 6 Sep 1 12:00 ready 
-rw-r--r-- 1 root root 4 Sep 1 12:00 store_version 
-rw-r--r-- 1 root root 53 Sep 1 12:00 superblock 
-rw-r--r-- 1 root root 2 Sep 1 12:00 whoami 
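
For reference, that file journal corresponds to something like this in 
ceph.conf (the 15728640000-byte file above is 15000 MB): 

[osd] 
    osd journal = /ceph2/journal 
    osd journal size = 15000 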


Let's try a more standard device journal on another partition of the 
same ssd. 1x Crucial m550 (device journal + data): 

$ ls -l /ceph2 
total 36 
-rw-r--r-- 1 root root 37 Sep 1 12:02 ceph_fsid 
drwxr-xr-x 68 root root 4096 Sep 1 12:02 current 
-rw-r--r-- 1 root root 37 Sep 1 12:02 fsid 
lrwxrwxrwx 1 root root 9 Sep 1 12:02 journal -> /dev/sdd1 
-rw------- 1 root root 56 Sep 1 12:02 keyring 
-rw-r--r-- 1 root root 21 Sep 1 12:02 magic 
-rw-r--r-- 1 root root 6 Sep 1 12:02 ready 
-rw-r--r-- 1 root root 4 Sep 1 12:02 store_version 
-rw-r--r-- 1 root root 53 Sep 1 12:02 superblock 
-rw-r--r-- 1 root root 2 Sep 1 12:02 whoami 
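
Switching from the file journal to the device journal is roughly the 
following (with the osd stopped first; the osd id 0 here is just a 
placeholder): 

$ sudo ceph-osd -i 0 --flush-journal 
$ sudo rm /ceph2/journal 
$ sudo ln -s /dev/sdd1 /ceph2/journal 
$ sudo ceph-osd -i 0 --mkjournal 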


rbd_thread: (g=0): rw=randwrite, bs=4K-4K/4K-4K/4K-4K, ioengine=rbd, 
iodepth=64 
fio-2.1.11-20-g9a44 
Starting 1 process 
rbd_thread: (groupid=0, jobs=1): err= 0: pid=4463: Mon Sep 1 09:16:16 2014 
write: io=1024.0MB, bw=22105KB/s, iops=5526, runt= 47436msec 
slat (usec): min=11, max=4054, avg=52.66, stdev=62.79 
clat (msec): min=3, max=43, avg=11.20, stdev= 1.69 
lat (msec): min=4, max=43, avg=11.25, stdev= 1.69 
clat percentiles (usec): 
| 1.00th=[ 7904], 5.00th=[ 8896], 10.00th=[ 9408], 20.00th=[10048], 
| 30.00th=[10432], 40.00th=[10688], 50.00th=[11072], 60.00th=[11456], 
| 70.00th=[11712], 80.00th=[12224], 90.00th=[12992], 95.00th=[13888], 
| 99.00th=[16768], 99.50th=[17792], 99.90th=[20352], 99.95th=[24960], 
| 99.99th=[42240] 
bw (KB /s): min=20285, max=23537, per=100.00%, avg=22126.98, 
stdev=579.19 
lat (msec) : 4=0.01%, 10=20.03%, 20=79.86%, 50=0.11% 
cpu : usr=23.48%, sys=2.58%, ctx=302278, majf=0, minf=16786 
IO depths : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.6%, 32=82.8%, 
>=64=16.6% 
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, 
>=64=0.0% 
complete : 0=0.0%, 4=93.9%, 8=3.0%, 16=2.0%, 32=1.0%, 64=0.1%, 
>=64=0.0% 
issued : total=r=0/w=262144/d=0, short=r=0/w=0/d=0 
latency : target=0, window=0, percentile=100.00%, depth=64 
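
The fio template itself was attached to the earlier mail; reconstructed 
from the parameters visible in the output it is essentially the 
following (the client, pool and image names here are just placeholders): 

[global] 
ioengine=rbd 
clientname=admin 
pool=rbd 
rbdname=fio-test 
rw=randwrite 
bs=4k 
iodepth=64 
# the test image is ~1 GB, which matches the io=1024.0MB totals above 

[rbd_thread] 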

So we seem to lose a bit of performance with the device journal. Finally, 
let's use 2 ssds again, but with a file journal only on the 2nd one. 2x 
Crucial m550 (1x file journal, 1x data): 

rbd_thread: (g=0): rw=randwrite, bs=4K-4K/4K-4K/4K-4K, ioengine=rbd, 
iodepth=64 
Starting 1 process 
fio-2.1.11-20-g9a44 

rbd_thread: (groupid=0, jobs=1): err= 0: pid=6943: Mon Sep 1 11:18:01 2014 
write: io=1024.0MB, bw=32248KB/s, iops=8062, runt= 32516msec 
slat (usec): min=11, max=4843, avg=45.42, stdev=43.57 
clat (usec): min=657, max=22614, avg=7806.70, stdev=1319.02 
lat (msec): min=1, max=22, avg= 7.85, stdev= 1.32 
clat percentiles (usec): 
| 1.00th=[ 4384], 5.00th=[ 5984], 10.00th=[ 6432], 20.00th=[ 6880], 
| 30.00th=[ 7200], 40.00th=[ 7520], 50.00th=[ 7776], 60.00th=[ 8032], 
| 70.00th=[ 8384], 80.00th=[ 8640], 90.00th=[ 9152], 95.00th=[ 9664], 
| 99.00th=[11328], 99.50th=[13376], 99.90th=[17536], 99.95th=[18304], 
| 99.99th=[21376] 
bw (KB /s): min=30408, max=35320, per=100.00%, avg=32339.56, 
stdev=937.80 
lat (usec) : 750=0.01% 
lat (msec) : 2=0.03%, 4=0.70%, 10=95.96%, 20=3.29%, 50=0.02% 
cpu : usr=31.37%, sys=3.42%, ctx=181872, majf=0, minf=16759 
IO depths : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=56.6%, 
>=64=43.3% 
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, 
>=64=0.0% 
complete : 0=0.0%, 4=97.1%, 8=2.4%, 16=0.4%, 32=0.1%, 64=0.1%, 
>=64=0.0% 
issued : total=r=0/w=262144/d=0, short=r=0/w=0/d=0 
latency : target=0, window=0, percentile=100.00%, depth=64 

So we are up to 8K IOPS again. Observe we are not maxing out the ssds: 

Device:  rrqm/s   wrqm/s     r/s      w/s    rMB/s    wMB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util 
sda        0.00     0.00    0.00     0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00 
sdb        0.00     0.00    0.00     0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00 
sdd        0.00  5048.00    0.00  7550.00     0.00    83.43    22.63     2.80    0.37    0.00    0.37   0.04  31.60 
sdc        0.00     0.00    0.00  7145.00     0.00    72.21    20.70     0.27    0.04    0.00    0.04   0.04  26.80 

Allegedly this model ssd (128G m550) can do 75K 4k random write IOPS 
(running fio on the filesystem I've seen 70K IOPS, so that figure is 
reasonably believable). So anyway we are not getting anywhere near the 
max IOPS from our devices. 
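
That 70K figure came from running fio directly against a filesystem on 
the ssd, i.e. something along these lines (file location and size are 
placeholders): 

$ fio --name=raw4k --directory=/ceph2 --size=4g --ioengine=libaio \
      --direct=1 --rw=randwrite --bs=4k --iodepth=64 --runtime=60 --time_based 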

We use the Intel S3700 for production ceph servers, so I'll see if we 
have any I can test on - it would be interesting to see whether I find 
the same 3.5K issue or not. 

Cheers 

Mark 


_______________________________________________ 
ceph-users mailing list 
ceph-users at lists.ceph.com 
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com 

