This probably muddies the water.

Note: this is an active cluster, with around 22 read/write IOPS and roughly 200 kB/s of read/write traffic going on during the tests.

CephFS is mounted from a cluster of 3 hosts with 6 OSDs per host, with 8G public and 10G private networking for Ceph. No SSDs; the drives are mostly WD Red 1T 2.5", with some HGST 1T 7200 RPM.

root@blade7:~# fio -ioengine=libaio -name=test -bs=4k -iodepth=32 -rw=randwrite -direct=1 -runtime=60 -filename=/mnt/pve/cephfs/test
test: (g=0): rw=randwrite, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=32
fio-3.12
Starting 1 process
test: you need to specify size=
fio: pid=0, err=22/file:filesetup.c:952, func=total_file_size, error=Invalid argument

Run status group 0 (all jobs):

root@blade7:~# fio -ioengine=libaio -name=test -bs=4k -iodepth=32 -rw=randwrite -direct=1 -runtime=60 -size=10G -filename=/mnt/pve/cephfs/test
test: (g=0): rw=randwrite, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=32
fio-3.12
Starting 1 process
test: Laying out IO file (1 file / 10240MiB)
Jobs: 1 (f=0): [f(1)][100.0%][w=580KiB/s][w=145 IOPS][eta 00m:00s]
test: (groupid=0, jobs=1): err= 0: pid=3561674: Sat Aug 17 09:20:22 2019
  write: IOPS=2262, BW=9051KiB/s (9268kB/s)(538MiB/60845msec); 0 zone resets
    slat (usec): min=8, max=35648, avg=40.01, stdev=97.51
    clat (usec): min=954, max=2854.3k, avg=14090.15, stdev=100194.83
     lat (usec): min=994, max=2854.3k, avg=14130.65, stdev=100195.40
    clat percentiles (usec):
     |  1.00th=[    1254],  5.00th=[    1450], 10.00th=[    1582],
     | 20.00th=[    1795], 30.00th=[    2008], 40.00th=[    2245],
     | 50.00th=[    2540], 60.00th=[    2933], 70.00th=[    3392],
     | 80.00th=[    4228], 90.00th=[    7767], 95.00th=[   35914],
     | 99.00th=[  254804], 99.50th=[  616563], 99.90th=[ 1652556],
     | 99.95th=[ 2122318], 99.99th=[ 2600469]
   bw (  KiB/s): min=   48, max=44408, per=100.00%, avg=10387.54, stdev=10384.94, samples=106
   iops        : min=   12, max=11102, avg=2596.88, stdev=2596.23, samples=106
  lat (usec)   : 1000=0.01%
  lat (msec)   : 2=29.82%, 4=47.95%, 10=14.23%, 20=2.43%, 50=1.34%
  lat (msec)   : 100=2.68%, 250=0.53%, 500=0.40%, 750=0.20%, 1000=0.14%
  cpu          : usr=1.45%, sys=6.36%, ctx=151946, majf=0, minf=280
  IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=100.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.1%, 64=0.0%, >=64=0.0%
     issued rwts: total=0,137674,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=32

Run status group 0 (all jobs):
  WRITE: bw=9051KiB/s (9268kB/s), 9051KiB/s-9051KiB/s (9268kB/s-9268kB/s), io=538MiB (564MB), run=60845-60845msec
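For comparison, it can also help to take the filesystem layer out of the picture and benchmark the data pool directly with rados bench. A rough sketch of what that would look like — the pool name cephfs_data is an assumption here, substitute whatever your CephFS data pool is actually called:

  # 4 KiB writes, 32 in flight; --no-cleanup keeps the objects for the read pass
  rados bench -p cephfs_data 60 write -b 4096 -t 32 --no-cleanup
  # random reads over the objects written above
  rados bench -p cephfs_data 60 rand -t 32
  # remove the benchmark objects when done
  rados -p cephfs_data cleanup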
This is on the same system with an RBD-mapped file system:

root@blade7:/mnt# fio -ioengine=libaio -name=test -bs=4k -iodepth=32 -rw=randwrite -direct=1 -runtime=60 -size=10G -filename=/mnt/image0/test
test: (g=0): rw=randwrite, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=32
fio-3.12
Starting 1 process
test: Laying out IO file (1 file / 10240MiB)
Jobs: 1 (f=1): [w(1)][4.5%][w=4KiB/s][w=1 IOPS][eta 21m:30s]
test: (groupid=0, jobs=1): err= 0: pid=3567399: Sat Aug 17 09:38:55 2019
  write: IOPS=1935, BW=7744KiB/s (7930kB/s)(462MiB/61143msec); 0 zone resets
    slat (usec): min=9, max=700161, avg=65.17, stdev=2092.54
    clat (usec): min=954, max=2578.6k, avg=16457.67, stdev=109995.03
     lat (usec): min=1021, max=2578.6k, avg=16523.42, stdev=110014.91
    clat percentiles (usec):
     |  1.00th=[    1254],  5.00th=[    1434], 10.00th=[    1549],
     | 20.00th=[    1745], 30.00th=[    1909], 40.00th=[    2114],
     | 50.00th=[    2376], 60.00th=[    2704], 70.00th=[    3228],
     | 80.00th=[    4080], 90.00th=[    8717], 95.00th=[   53216],
     | 99.00th=[  291505], 99.50th=[  675283], 99.90th=[ 1669333],
     | 99.95th=[ 2231370], 99.99th=[ 2365588]
   bw (  KiB/s): min=    8, max=35968, per=100.00%, avg=9015.64, stdev=8402.84, samples=105
   iops        : min=    2, max= 8992, avg=2253.90, stdev=2100.72, samples=105
  lat (usec)   : 1000=0.01%
  lat (msec)   : 2=34.85%, 4=44.49%, 10=11.54%, 20=1.84%, 50=1.81%
  lat (msec)   : 100=3.27%, 250=1.13%, 500=0.42%, 750=0.19%, 1000=0.08%
  cpu          : usr=1.42%, sys=6.63%, ctx=123309, majf=0, minf=283
  IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=100.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.1%, 64=0.0%, >=64=0.0%
     issued rwts: total=0,118371,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=32

Run status group 0 (all jobs):
  WRITE: bw=7744KiB/s (7930kB/s), 7744KiB/s-7744KiB/s (7930kB/s-7930kB/s), io=462MiB (485MB), run=61143-61143msec

Disk stats (read/write):
  rbd0: ios=0/118670, merge=0/9674, ticks=0/1894238, in_queue=1651008, util=33.33%

On 17/8/19 8:46 am, Olivier AUDRY wrote:
> Write and read with 2 hosts, 4 OSDs:
>
> mkfs.ext4 /dev/rbd/kube/bench
> mount /dev/rbd/kube/bench /mnt/
> dd if=/dev/zero of=test bs=8192k count=1000 oflag=direct
> 8388608000 bytes (8.4 GB, 7.8 GiB) copied, 117.541 s, 71.4 MB/s
>
> fio -ioengine=libaio -name=test -bs=4k -iodepth=32 -rw=randwrite
> -direct=1 -runtime=60 -filename=/dev/rbd/kube/bench
> WRITE: bw=45.3MiB/s (47.5MB/s), 45.3MiB/s-45.3MiB/s (47.5MB/s-
> 47.5MB/s), io=2718MiB (2850MB), run=60003-60003msec
>
> fio -ioengine=libaio -name=test -bs=4k -iodepth=32 -rw=randread
> -direct=1 -runtime=60 -filename=/dev/rbd/kube/bench
> READ: bw=187MiB/s (197MB/s), 187MiB/s-187MiB/s (197MB/s-197MB/s),
> io=10.0GiB (10.7GB), run=54636-54636msec
>
> pgbench before: 10 transactions per second
> pgbench after: 355 transactions per second
>
> So yes, it's better. The SSDs are Intel SSDSC2BB48 0370.
>
> On Saturday 17 August 2019 at 01:55 +0300, vitalif@xxxxxxxxxx wrote:
>>> on a new Ceph cluster with the same software and config (ansible) on
>>> the old hardware. 2 replicas, 1 host, 4 OSDs.
>>>
>>> => New hardware: 32.6MB/s READ / 10.5MiB/s WRITE
>>> => Old hardware: 184MiB/s READ / 46.9MiB/s WRITE
>>>
>>> No discussion? I suppose I will keep the old hardware. What do you
>>> think? :D
>>
>> In fact I don't really believe in 184 MB/s random reads with Ceph
>> with 4 OSDs; it's a very cool result if it's true.
>>
>> Does the "new cluster on the old hardware" consist of only 1 host?
>> Did you test reads before you actually wrote anything into the image,
>> so it was empty and reads were fast because of that?
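Regarding the question about reading from an empty image: an easy way to rule that out is to fill the image completely before running the read test. A sketch against Olivier's device path (the "prefill" job name is just illustrative):

  # sequential write across the whole device so every RBD object is allocated
  fio -ioengine=libaio -name=prefill -bs=4M -iodepth=16 -rw=write -direct=1 -filename=/dev/rbd/kube/bench
  # then repeat the random read test on the fully written image
  fio -ioengine=libaio -name=test -bs=4k -iodepth=32 -rw=randread -direct=1 -runtime=60 -filename=/dev/rbd/kube/bench

With no size= given, fio on a raw block device uses the full device size, so the prefill job writes the entire image before the read test runs.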