200 IOPS is close to what the sync write latency will allow with either slow CPUs or 1 GbE networking (at queue depth 1, IOPS is simply 1/latency, so ~200 IOPS means each write is spending roughly 5 ms in flight). What sort of hardware/networking are you running? With top-of-the-range hardware and a replica count of 2-3, don't expect to get much above 500-750 IOPS for a single direct write. I've put a few command sketches for narrowing this down at the bottom of this mail.

> -----Original Message-----
> From: ceph-users [mailto:ceph-users-bounces@xxxxxxxxxxxxxx] On Behalf Of
> Wido den Hollander
> Sent: 12 February 2016 09:18
> To: ceph-users@xxxxxxxx; Ferhat Ozkasgarli <ozkasgarli@xxxxxxxxx>
> Subject: Re: ceph 9.2.0 SAMSUNG ssd performance issue?
>
> On 12 February 2016 at 10:14, Ferhat Ozkasgarli <ozkasgarli@xxxxxxxxx> wrote:
> >
> > Hello Huan,
> >
> > If you look at the comments section of Sebastien's blog post
> > (https://www.sebastien-han.fr/blog/2014/10/10/ceph-how-to-test-if-your-ssd-is-suitable-as-a-journal-device/),
> > you can see that some Samsung SSDs behave very poorly in these tests:
> >
> > Samsung SSD 850 PRO 256GB
> > 409600000 bytes (410 MB) copied, 274.162 s, 1.5 MB/s
> >
> > INTEL 535 SSDSC2BW240H6 240GB
> > 409600000 bytes (410 MB) copied, 1022.64 s, 401 kB/s
>
> The SSD used (MZ7KM1T9) is a Samsung SM863, which is a write-intensive
> SSD from Samsung. It seems to perform pretty well.
>
> Wido
>
> > If you have similar disks, you should not use them with Ceph.
> >
> > On Fri, Feb 12, 2016 at 10:14 AM, Huan Zhang <huan.zhang.jn@xxxxxxxxx>
> > wrote:
> >
> > > Thanks for the reply!
> > > Not very good, but it seems acceptable. What do you think the possible
> > > reasons are? Would OSD perf counters be helpful for this?
> > >
> > > sudo fio --filename=/dev/sda2 --direct=1 --sync=1 --rw=write --bs=4k
> > > --numjobs=1 --iodepth=1 --runtime=60 --time_based --group_reporting
> > > --name=journal-test
> > >
> > > journal-test: (g=0): rw=write, bs=4K-4K/4K-4K/4K-4K, ioengine=sync, iodepth=1
> > > fio-2.1.11
> > > Starting 1 process
> > > Jobs: 1 (f=1): [W(1)] [100.0% done] [0KB/49928KB/0KB /s] [0/12.5K/0 iops] [eta 00m:00s]
> > > journal-test: (groupid=0, jobs=1): err= 0: pid=247168: Fri Feb 12 16:08:12 2016
> > >   write: io=2944.1MB, bw=50259KB/s, iops=12564, runt= 60001msec
> > >     clat (usec): min=43, max=1503, avg=77.47, stdev=17.37
> > >      lat (usec): min=43, max=1503, avg=77.75, stdev=17.42
> > >     clat percentiles (usec):
> > >      |  1.00th=[   47],  5.00th=[   50], 10.00th=[   54], 20.00th=[   63],
> > >      | 30.00th=[   67], 40.00th=[   73], 50.00th=[   76], 60.00th=[   79],
> > >      | 70.00th=[   86], 80.00th=[   91], 90.00th=[  100], 95.00th=[  105],
> > >      | 99.00th=[  122], 99.50th=[  129], 99.90th=[  147], 99.95th=[  155],
> > >      | 99.99th=[  167]
> > >     bw (KB  /s): min=44200, max=57680, per=100.00%, avg=50274.42, stdev=2662.04
> > >     lat (usec) : 50=4.64%, 100=84.85%, 250=10.51%, 500=0.01%, 750=0.01%
> > >     lat (msec) : 2=0.01%
> > >   cpu          : usr=6.34%, sys=32.72%, ctx=1507971, majf=0, minf=98
> > >   IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
> > >      submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
> > >      complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
> > >      issued    : total=r=0/w=753897/d=0, short=r=0/w=0/d=0
> > >      latency   : target=0, window=0, percentile=100.00%, depth=1
> > >
> > > Run status group 0 (all jobs):
> > >   WRITE: io=2944.1MB, aggrb=50258KB/s, minb=50258KB/s, maxb=50258KB/s,
> > >   mint=60001msec, maxt=60001msec
> > > Disk stats (read/write):
> > >   sda: ios=0/1506216, merge=0/0, ticks=0/39449, in_queue=39025, util=65.04%
> > >
> > > 2016-02-12 15:41 GMT+08:00 Huan Zhang <huan.zhang.jn@xxxxxxxxx>:
> > >
> > >> Hi,
> > >>
> > >> Ceph is VERY SLOW with 24 OSDs (Samsung SSDs).
> > >> fio /dev/rbd0 iodepth=1 direct=1: only ~200 IOPS
> > >> fio /dev/rbd0 iodepth=32 direct=1: only ~3000 IOPS
> > >>
> > >> But testing a single SSD device with fio:
> > >> fio iodepth=1 direct=1: ~15000 IOPS
> > >> fio iodepth=32 direct=1: ~30000 IOPS
> > >>
> > >> Why is Ceph SO SLOW? Could you give me some help?
> > >> Appreciated!
> > >>
> > >> My environment:
> > >> [root@szcrh-controller ~]# ceph -s
> > >>     cluster eb26a8b9-e937-4e56-a273-7166ffaa832e
> > >>      health HEALTH_WARN
> > >>             1 mons down, quorum 0,1,2,3,4 ceph01,ceph02,ceph03,ceph04,ceph05
> > >>      monmap e1: 6 mons at {ceph01=10.10.204.144:6789/0,ceph02=10.10.204.145:6789/0,ceph03=10.10.204.146:6789/0,ceph04=10.10.204.147:6789/0,ceph05=10.10.204.148:6789/0,ceph06=0.0.0.0:0/5}
> > >>             election epoch 6, quorum 0,1,2,3,4 ceph01,ceph02,ceph03,ceph04,ceph05
> > >>      osdmap e114: 24 osds: 24 up, 24 in
> > >>             flags sortbitwise
> > >>       pgmap v2213: 1864 pgs, 3 pools, 49181 MB data, 4485 objects
> > >>             144 GB used, 42638 GB / 42782 GB avail
> > >>                 1864 active+clean
> > >>
> > >> [root@ceph03 ~]# lsscsi
> > >> [0:0:6:0]  disk  ATA  SAMSUNG MZ7KM1T9  003Q  /dev/sda
> > >> [0:0:7:0]  disk  ATA  SAMSUNG MZ7KM1T9  003Q  /dev/sdb
> > >> [0:0:8:0]  disk  ATA  SAMSUNG MZ7KM1T9  003Q  /dev/sdc
> > >> [0:0:9:0]  disk  ATA  SAMSUNG MZ7KM1T9  003Q  /dev/sdd
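
To pin down where the time is going, the first thing I would do is repeat the same single-threaded sync write test you ran against the raw SSD, but pointed at the RBD device. A minimal sketch, assuming the volume is still mapped at /dev/rbd0 as in your earlier test (note this writes directly to the device and will destroy any data on it):

# Same 4k sync-write test as your journal test, but against the RBD device.
# With iodepth=1 the reported IOPS is simply 1/latency per write.
fio --filename=/dev/rbd0 --direct=1 --sync=1 --rw=write --bs=4k \
    --numjobs=1 --iodepth=1 --runtime=60 --time_based \
    --group_reporting --name=rbd-latency-test

Compare the average "lat" line with the ~78 us you measured on the raw SSD; the difference is the network round trips plus OSD/journal overhead paid on every replicated write.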
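
On the perf counter question: yes, they can help show whether the time is being spent in the journal, in the OSD op path, or on the wire. A rough sketch, run on an OSD host while the fio test is going (osd.0 is just an example; exact counter names can vary a little between releases):

# Per-OSD latency breakdown via the admin socket; look at op_w_latency and
# subop_w_latency in the "osd" section and journal_latency under "filestore".
ceph daemon osd.0 perf dump

# Quick cluster-wide view of per-OSD commit/apply latency
ceph osd perf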
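
And since Sebastien's post came up above: the numbers quoted in its comments come from a dd sync-write test, which as far as I remember looks roughly like the following. It writes straight to the raw device, so only run it against a disk you can wipe (randfile here is just a scratch file of random data created for the test):

# Create ~400MB of random data to write from
dd if=/dev/urandom of=randfile bs=1M count=400

# 4k O_DIRECT + O_DSYNC writes to the raw device, matching the figures above
dd if=randfile of=/dev/sda bs=4k count=100000 oflag=direct,dsync

A write-intensive drive like the SM863 should hold up far better on this test than the consumer models listed in those comments.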