OK, I read your link. My SSDs are bad, it's the capacitors ... I don't choose them, they come with the hardware I rent. Perhaps it would be better to switch to HDDs. I cannot even put a journal on them ... bad news :(

On Friday, 16 August 2019 at 17:37 +0200, Olivier AUDRY wrote:
> hello
>
> here on the nvme partition directly
>
> - libaio randwrite /dev/nvme1n1p4 => WRITE: bw=12.1MiB/s (12.7MB/s), 12.1MiB/s-12.1MiB/s (12.7MB/s-12.7MB/s), io=728MiB (763MB), run=60001-60001msec
> - libaio randread /dev/nvme1n1p4 => READ: bw=35.6MiB/s (37.3MB/s), 35.6MiB/s-35.6MiB/s (37.3MB/s-37.3MB/s), io=2134MiB (2237MB), run=60001-60001msec
>
> here on the rbd
>
> - rbd read: READ: bw=580MiB/s (608MB/s), 580MiB/s-580MiB/s (608MB/s-608MB/s), io=10.0GiB (10.7GB), run=17668-17668msec (I want this perf! :) )
> - rbd write: WRITE: bw=90.9MiB/s (95.3MB/s), 90.9MiB/s-90.9MiB/s (95.3MB/s-95.3MB/s), io=5764MiB (6044MB), run=63404-63404msec (I want this perf! :) )
>
> here on the mapped rbd
>
> - libaio randwrite on mapped rbd: WRITE: bw=217KiB/s (223kB/s), 217KiB/s-217KiB/s (223kB/s-223kB/s), io=12.7MiB (13.4MB), run=60006-60006msec
> - libaio randread on mapped rbd: READ: bw=589KiB/s (603kB/s), 589KiB/s-589KiB/s (603kB/s-603kB/s), io=34.5MiB (36.2MB), run=60005-60005msec
>
> here on the mounted fs:
>
> rbd map bench --pool kube --name client.admin
> /sbin/mkfs.ext4 /dev/rbd/kube/bench
> mount /dev/rbd/kube/bench /mnt/
> cd /mnt/
> dd if=/dev/zero of=test bs=8192k count=100 oflag=direct
> 838860800 bytes (839 MB, 800 MiB) copied, 24.5338 s, 34.2 MB/s
>
> Raw NVMe performance does not look very great ... but the raw RBD performance is great. Once I map it, the performance goes bad, whether I use dd on the filesystem or fio on the device.
>
> What I don't understand is why the difference between raw RBD and mapped RBD is so big.
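A note on the raw RBD vs mapped RBD gap quoted above: the librbd runs used 4M blocks at iodepth=32 with no fsync, while the mapped-device runs used 4k blocks at iodepth=1 with fsync=1, so the two are not directly comparable. A rough apples-to-apples sketch, assuming the bench image is still mapped at /dev/rbd/kube/bench as above (destructive, the image gets overwritten):

# same workload shape as the ioengine=rbd runs, but through the kernel client
fio -ioengine=libaio -name=test -bs=4M -iodepth=32 -direct=1 -rw=write -runtime=60 -filename=/dev/rbd/kube/bench
fio -ioengine=libaio -name=test -bs=4M -iodepth=32 -direct=1 -rw=read -runtime=60 -filename=/dev/rbd/kube/bench

If these land near the 90.9 MiB/s and 580 MiB/s librbd figures, the gap comes mostly from the workload shape (small blocks, queue depth 1, fsync on every write) rather than from the krbd path itself.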
>
> The nvme disk looks to be: Sandisk Corp WD Black 2018/PC SN720 NVMe SSD
>
> Rand write:
> fio -ioengine=libaio -name=test -bs=4k -iodepth=1 -direct=1 -fsync=1 -rw=randwrite -runtime=60 -filename=/dev/nvme1n1p4
> test: (g=0): rw=randwrite, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=1
> fio-3.12
> Starting 1 process
> Jobs: 1 (f=1): [w(1)][100.0%][w=12.3MiB/s][w=3158 IOPS][eta 00m:00s]
> test: (groupid=0, jobs=1): err= 0: pid=177938: Fri Aug 16 17:18:18 2019
>   write: IOPS=3105, BW=12.1MiB/s (12.7MB/s)(728MiB/60001msec); 0 zone resets
>     slat (nsec): min=1545, max=141856, avg=6784.48, stdev=6188.10
>     clat (nsec): min=688, max=3539.6k, avg=14665.22, stdev=13601.60
>     lat (usec): min=9, max=3549, avg=21.66, stdev=16.07
>     clat percentiles (usec):
>      |  1.00th=[ 8],  5.00th=[ 8], 10.00th=[ 9], 20.00th=[ 9],
>      | 30.00th=[ 10], 40.00th=[ 10], 50.00th=[ 12], 60.00th=[ 14],
>      | 70.00th=[ 16], 80.00th=[ 19], 90.00th=[ 26], 95.00th=[ 36],
>      | 99.00th=[ 49], 99.50th=[ 52], 99.90th=[ 76], 99.95th=[ 85],
>      | 99.99th=[ 135]
>    bw ( KiB/s): min=10504, max=13232, per=100.00%, avg=12420.01, stdev=439.86, samples=119
>    iops       : min= 2626, max= 3308, avg=3105.00, stdev=109.96, samples=119
>   lat (nsec)  : 750=0.01%, 1000=0.01%
>   lat (usec)  : 2=0.02%, 4=0.09%, 10=40.60%, 20=43.62%, 50=14.92%
>   lat (usec)  : 100=0.73%, 250=0.02%, 500=0.01%, 750=0.01%
>   lat (msec)  : 4=0.01%
>   fsync/fdatasync/sync_file_range:
>     sync (nsec): min=12, max=22490, avg=185.20, stdev=255.75
>     sync percentiles (nsec):
>      |  1.00th=[ 28],  5.00th=[ 35], 10.00th=[ 42], 20.00th=[ 59],
>      | 30.00th=[ 82], 40.00th=[ 109], 50.00th=[ 137], 60.00th=[ 163],
>      | 70.00th=[ 197], 80.00th=[ 253], 90.00th=[ 390], 95.00th=[ 572],
>      | 99.00th=[ 804], 99.50th=[ 820], 99.90th=[ 1144], 99.95th=[ 1208],
>      | 99.99th=[15552]
>   cpu         : usr=2.42%, sys=4.91%, ctx=546845, majf=0, minf=12
>   IO depths   : 1=200.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
>      submit   : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
>      complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
>      issued rwts: total=0,186313,0,186313 short=0,0,0,0 dropped=0,0,0,0
>      latency  : target=0, window=0, percentile=100.00%, depth=1
>
> Run status group 0 (all jobs):
>   WRITE: bw=12.1MiB/s (12.7MB/s), 12.1MiB/s-12.1MiB/s (12.7MB/s-12.7MB/s), io=728MiB (763MB), run=60001-60001msec
>
> Disk stats (read/write):
>   nvme1n1: ios=0/375662, merge=0/1492, ticks=0/57478, in_queue=59100, util=97.85%
>
> RBD read
>
> ~# fio -ioengine=rbd -name=test -bs=4M -iodepth=32 -rw=read -runtime=60 -pool=kube -rbdname=bench
> test: (g=0): rw=read, bs=(R) 4096KiB-4096KiB, (W) 4096KiB-4096KiB, (T) 4096KiB-4096KiB, ioengine=rbd, iodepth=32
> fio-3.12
> Starting 1 process
> Jobs: 1 (f=1): [R(1)][69.2%][r=1313MiB/s][r=328 IOPS][eta 00m:08s]
> test: (groupid=0, jobs=1): err= 0: pid=184386: Fri Aug 16 17:20:45 2019
>   read: IOPS=144, BW=580MiB/s (608MB/s)(10.0GiB/17668msec)
>     slat (nsec): min=371, max=72009, avg=5416.87, stdev=3649.19
>     clat (msec): min=6, max=1438, avg=220.83, stdev=192.27
>     lat (msec): min=6, max=1438, avg=220.83, stdev=192.27
>     clat percentiles (msec):
>      |  1.00th=[ 19],  5.00th=[ 23], 10.00th=[ 25], 20.00th=[ 30],
>      | 30.00th=[ 33], 40.00th=[ 47], 50.00th=[ 249], 60.00th=[ 288],
>      | 70.00th=[ 326], 80.00th=[ 372], 90.00th=[ 447], 95.00th=[ 535],
>      | 99.00th=[ 802], 99.50th=[ 844], 99.90th=[ 936], 99.95th=[ 1011],
>      | 99.99th=[ 1435]
>    bw ( KiB/s): min=253952, max=4628480, per=93.74%, avg=556353.83, stdev=851591.15, samples=35
>    iops       : min= 62, max= 1130, avg=135.83, stdev=207.91, samples=35
>   lat (msec)  : 10=0.08%, 20=2.03%, 50=38.40%, 100=2.15%, 250=7.62%
>   lat (msec)  : 500=43.48%, 750=4.73%, 1000=1.45%
>   cpu         : usr=0.19%, sys=0.24%, ctx=2563, majf=0, minf=7
>   IO depths   : 1=0.1%, 2=0.1%, 4=0.2%, 8=0.3%, 16=0.6%, 32=98.8%, >=64=0.0%
>      submit   : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
>      complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.1%, 64=0.0%, >=64=0.0%
>      issued rwts: total=2560,0,0,0 short=0,0,0,0 dropped=0,0,0,0
>      latency  : target=0, window=0, percentile=100.00%, depth=32
>
> Run status group 0 (all jobs):
>   READ: bw=580MiB/s (608MB/s), 580MiB/s-580MiB/s (608MB/s-608MB/s), io=10.0GiB (10.7GB), run=17668-17668msec
>
> Disk stats (read/write):
>   md2: ios=1/1405, merge=0/0, ticks=0/0, in_queue=0, util=0.00%, aggrios=1332/3191, aggrmerge=5911/5187, aggrticks=1031/1257, aggrin_queue=15480, aggrutil=91.95%
>   nvme1n1: ios=0/1183, merge=0/450, ticks=0/141, in_queue=12452, util=69.94%
>   nvme0n1: ios=2665/5199, merge=11822/9925, ticks=2062/2373, in_queue=18508, util=91.95%
>
> RBD write
>
> fio -ioengine=rbd -name=test -bs=4M -iodepth=32 -rw=write -runtime=60 -pool=kube -rbdname=bench
> test: (g=0): rw=write, bs=(R) 4096KiB-4096KiB, (W) 4096KiB-4096KiB, (T) 4096KiB-4096KiB, ioengine=rbd, iodepth=32
> fio-3.12
> Starting 1 process
> Jobs: 1 (f=1): [W(1)][55.3%][eta 00m:51s]
> test: (groupid=0, jobs=1): err= 0: pid=181899: Fri Aug 16 17:20:19 2019
>   write: IOPS=22, BW=90.9MiB/s (95.3MB/s)(5764MiB/63404msec); 0 zone resets
>     slat (usec): min=366, max=4889, avg=802.71, stdev=414.92
>     clat (msec): min=189, max=7308, avg=1404.62, stdev=1278.48
>     lat (msec): min=189, max=7308, avg=1405.43, stdev=1278.47
>     clat percentiles (msec):
>      |  1.00th=[ 292],  5.00th=[ 376], 10.00th=[ 435], 20.00th=[ 550],
>      | 30.00th=[ 676], 40.00th=[ 776], 50.00th=[ 877], 60.00th=[ 1083],
>      | 70.00th=[ 1351], 80.00th=[ 1921], 90.00th=[ 3473], 95.00th=[ 4597],
>      | 99.00th=[ 5470], 99.50th=[ 6007], 99.90th=[ 6745], 99.95th=[ 7282],
>      | 99.99th=[ 7282]
>    bw ( KiB/s): min= 8192, max=155648, per=100.00%, avg=95454.68, stdev=30757.07, samples=121
>    iops       : min= 2, max= 38, avg=23.27, stdev= 7.52, samples=121
>   lat (msec)  : 250=0.07%, 500=15.82%, 750=22.21%, 1000=18.74%
>   cpu         : usr=1.71%, sys=0.15%, ctx=597, majf=0, minf=45137
>   IO depths   : 1=0.1%, 2=0.1%, 4=0.3%, 8=0.6%, 16=1.1%, 32=97.8%, >=64=0.0%
>      submit   : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
>      complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.1%, 64=0.0%, >=64=0.0%
>      issued rwts: total=0,1441,0,0 short=0,0,0,0 dropped=0,0,0,0
>      latency  : target=0, window=0, percentile=100.00%, depth=32
>
> Run status group 0 (all jobs):
>   WRITE: bw=90.9MiB/s (95.3MB/s), 90.9MiB/s-90.9MiB/s (95.3MB/s-95.3MB/s), io=5764MiB (6044MB), run=63404-63404msec
>
> Disk stats (read/write):
>   md2: ios=0/5119, merge=0/0, ticks=0/0, in_queue=0, util=0.00%, aggrios=30/16383, aggrmerge=20/28021, aggrticks=5/18949, aggrin_queue=63466, aggrutil=90.70%
>   nvme1n1: ios=0/4230, merge=0/1595, ticks=0/1835, in_queue=42440, util=65.48%
>   nvme0n1: ios=60/28536, merge=41/54447, ticks=10/36063, in_queue=84492, util=90.70%
>
> fio -ioengine=libaio -name=test -bs=4k -iodepth=1 -direct=1 -fsync=1 -rw=randread -runtime=60 -filename=/dev/nvme1n1p4
> test: (g=0): rw=randread, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=1
> fio-3.12
> Starting 1 process
> Jobs: 1 (f=1): [r(1)][100.0%][r=36.3MiB/s][r=9286 IOPS][eta 00m:00s]
> test: (groupid=0, jobs=1): err= 0: pid=208060: Fri Aug 16 17:33:01 2019
>   read: IOPS=9103, BW=35.6MiB/s (37.3MB/s)(2134MiB/60001msec)
>     slat (nsec): min=1384, max=244751, avg=6077.86, stdev=5529.26
>     clat (usec): min=3, max=8311, avg=101.98, stdev=42.25
>     lat (usec): min=34, max=8341, avg=108.28, stdev=43.81
>     clat percentiles (usec):
>      |  1.00th=[ 73],  5.00th=[ 83], 10.00th=[ 85], 20.00th=[ 90],
>      | 30.00th=[ 92], 40.00th=[ 94], 50.00th=[ 96], 60.00th=[ 100],
>      | 70.00th=[ 104], 80.00th=[ 112], 90.00th=[ 121], 95.00th=[ 141],
>      | 99.00th=[ 182], 99.50th=[ 198], 99.90th=[ 253], 99.95th=[ 297],
>      | 99.99th=[ 2147]
>    bw ( KiB/s): min=33088, max=40928, per=99.96%, avg=36400.00, stdev=1499.29, samples=119
>    iops       : min= 8272, max=10232, avg=9099.98, stdev=374.82, samples=119
>   lat (usec)  : 4=0.01%, 10=0.01%, 50=0.53%, 100=59.74%, 250=39.62%
>   lat (usec)  : 500=0.08%, 750=0.01%, 1000=0.01%
>   lat (msec)  : 2=0.01%, 4=0.01%, 10=0.01%
>   cpu         : usr=4.80%, sys=8.33%, ctx=546238, majf=0, minf=10
>   IO depths   : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
>      submit   : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
>      complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
>      issued rwts: total=546228,0,0,0 short=0,0,0,0 dropped=0,0,0,0
>      latency  : target=0, window=0, percentile=100.00%, depth=1
>
> Run status group 0 (all jobs):
>   READ: bw=35.6MiB/s (37.3MB/s), 35.6MiB/s-35.6MiB/s (37.3MB/s-37.3MB/s), io=2134MiB (2237MB), run=60001-60001msec
>
> Disk stats (read/write):
>   nvme1n1: ios=545111/3564, merge=0/1430, ticks=54388/943, in_queue=60924, util=100.00%
>
> On Friday, 16 August 2019 at 18:06 +0300, vitalif@xxxxxxxxxx wrote:
> > Now to go for "apples to apples" either run
> >
> > fio -ioengine=libaio -name=test -bs=4k -iodepth=1 -direct=1 -fsync=1 -rw=randwrite -runtime=60 -filename=/dev/nvmeXXXXXXXXX
> >
> > to compare with the single-threaded RBD random write result (the test is destructive, so use a separate partition without data)
> >
> > ...Or run
> >
> > fio -ioengine=rbd -name=test -bs=4M -iodepth=32 -rw=write -runtime=60 -pool=kube -rbdname=bench
> >
> > to compare with your dd's linear write result.
> >
> > 58 single-threaded random iops for NVMes is pretty sad either way. Are your NVMes server ones? Do they have capacitors? :) In response to the likely question of what those are, I'll just post my link here again :)
> > https://yourcmc.ru/wiki/Ceph_performance
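One way to check whether missing power-loss-protection capacitors really are the problem (as the wiki above suggests) is to rerun the same 4k/QD1 random write without -fsync=1 and compare it with the ~3100 IOPS measured above with fsync. A rough sketch, destructive, on the same /dev/nvme1n1p4 scratch partition used above:

# same test as the "Rand write" run above, but without fsync; a much higher
# result usually means the drive has no power-loss protection, so every flush
# has to go all the way to the flash
fio -ioengine=libaio -name=test -bs=4k -iodepth=1 -direct=1 -rw=randwrite -runtime=60 -filename=/dev/nvme1n1p4

On drives with capacitors the two numbers tend to be close, because the drive can safely acknowledge flushed writes from its cache.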