Hi,

I’m now working with the raw device and getting interesting results. For one, I went through all reviews about the Micron DC S610 again and, as always, the devil is in the detail. I noticed that the test results are quite favorable, but I hadn’t previously noticed the caveat (which applies to SSDs in general) that preconditioning may be in order. In their tests the Micron shows quite extreme initial max latency until preconditioning settles. I can relate to that, as the SSDs I put into the cluster last Friday (5 days ago) show quite different characteristics in my statistics than the ones I added this Monday evening (2 days ago). I took one of the early ones and evacuated the OSD to perform tests.

Sebastian’s fio call for testing journal ability currently looks like this:

cartman06 ~ # fio --filename=/dev/sdl --direct=1 --sync=1 --rw=write --bs=4k --numjobs=1 --iodepth=1 --runtime=60 --time_based --group_reporting --name=journal-test
journal-test: (g=0): rw=write, bs=4K-4K/4K-4K/4K-4K, ioengine=sync, iodepth=1
fio-2.0.14
Starting 1 process
Jobs: 1 (f=1): [W] [100.0% done] [0K/88852K/0K /s] [0 /22.3K/0 iops] [eta 00m:00s]
journal-test: (groupid=0, jobs=1): err= 0: pid=28606: Wed Dec 7 11:59:36 2016
  write: io=5186.7MB, bw=88517KB/s, iops=22129, runt= 60001msec
    clat (usec): min=37, max=1519, avg=43.77, stdev=10.89
     lat (usec): min=37, max=1519, avg=43.94, stdev=10.90
    clat percentiles (usec):
     |  1.00th=[   39],  5.00th=[   40], 10.00th=[   40], 20.00th=[   41],
     | 30.00th=[   41], 40.00th=[   42], 50.00th=[   42], 60.00th=[   42],
     | 70.00th=[   43], 80.00th=[   44], 90.00th=[   47], 95.00th=[   53],
     | 99.00th=[   71], 99.50th=[   87], 99.90th=[  157], 99.95th=[  201],
     | 99.99th=[  478]
    bw (KB/s)  : min=81096, max=91312, per=100.00%, avg=88519.19, stdev=1762.43
    lat (usec) : 50=92.42%, 100=7.28%, 250=0.27%, 500=0.02%, 750=0.01%
    lat (usec) : 1000=0.01%
    lat (msec) : 2=0.01%
  cpu          : usr=5.43%, sys=14.64%, ctx=1327888, majf=0, minf=6
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued    : total=r=0/w=1327777/d=0, short=r=0/w=0/d=0

Run status group 0 (all jobs):
  WRITE: io=5186.7MB, aggrb=88516KB/s, minb=88516KB/s, maxb=88516KB/s, mint=60001msec, maxt=60001msec

Disk stats (read/write):
  sdl: ios=15/1326283, merge=0/0, ticks=1/47203, in_queue=46970, util=78.29%

That doesn’t look too bad to me; specifically, the 99.99th percentile of 478 microseconds seems fine.
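As a side note: to actually watch the preconditioning settle over time rather than only in the end-of-run summary, I could run the same call for longer and let fio write completion latency logs. This is just a sketch I haven’t run yet; the longer runtime and the log prefix are arbitrary choices of mine:

# same journal test, but for 10 minutes, writing latency log files
# with the given prefix so the max-latency trend can be graphed afterwards
fio --filename=/dev/sdl --direct=1 --sync=1 --rw=write --bs=4k \
    --numjobs=1 --iodepth=1 --runtime=600 --time_based --group_reporting \
    --write_lat_log=journal-test-lat --name=journal-test

If I remember correctly, fio_generate_plots (shipped with fio) can turn those log files into graphs.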
The iostat during this run looks OK as well:

cartman06 ~ # iostat -x 5 sdl
Linux 4.4.27-gentoo (cartman06)   12/07/2016   _x86_64_   (24 CPU)

avg-cpu:  %user  %nice  %system  %iowait  %steal  %idle
           5.70   0.05     3.09     5.09    0.00  86.07

Device:  rrqm/s  wrqm/s    r/s       w/s    rkB/s     wkB/s  avgrq-sz  avgqu-sz  await  r_await  w_await  svctm  %util
sdl        0.00    2.92  31.24    148.16  1851.99   8428.66    114.61      1.61   9.00     0.68    10.75   0.22   4.03

avg-cpu:  %user  %nice  %system  %iowait  %steal  %idle
           3.31   0.04     1.97     1.48    0.00  93.19

Device:  rrqm/s  wrqm/s    r/s       w/s    rkB/s     wkB/s  avgrq-sz  avgqu-sz  await  r_await  w_await  svctm  %util
sdl        0.00    0.00   0.00      0.00     0.00      0.00      0.00      0.00   0.00     0.00     0.00   0.00   0.00

avg-cpu:  %user  %nice  %system  %iowait  %steal  %idle
           4.02   0.03     2.38     1.44    0.00  92.13

Device:  rrqm/s  wrqm/s    r/s       w/s    rkB/s     wkB/s  avgrq-sz  avgqu-sz  await  r_await  w_await  svctm  %util
sdl        0.00    0.00  12.40   3101.40    92.80  12405.60      8.03      0.11   0.04     0.10     0.04   0.04  11.12

avg-cpu:  %user  %nice  %system  %iowait  %steal  %idle
           4.02   0.05     3.57     4.78    0.00  87.58

Device:  rrqm/s  wrqm/s    r/s       w/s    rkB/s     wkB/s  avgrq-sz  avgqu-sz  await  r_await  w_await  svctm  %util
sdl        0.00    0.00   0.00  22166.20     0.00  88664.80      8.00      0.80   0.04     0.00     0.04   0.04  79.58

avg-cpu:  %user  %nice  %system  %iowait  %steal  %idle
           3.64   0.05     2.77     4.98    0.00  88.56

Device:  rrqm/s  wrqm/s    r/s       w/s    rkB/s     wkB/s  avgrq-sz  avgqu-sz  await  r_await  w_await  svctm  %util
sdl        0.00    0.00   0.00  22304.20     0.00  89216.80      8.00      0.78   0.04     0.00     0.04   0.04  78.08

avg-cpu:  %user  %nice  %system  %iowait  %steal  %idle
           4.89   0.05     2.97    11.15    0.00  80.93

Device:  rrqm/s  wrqm/s    r/s       w/s    rkB/s     wkB/s  avgrq-sz  avgqu-sz  await  r_await  w_await  svctm  %util
sdl        0.00    0.00   0.00  22022.00     0.00  88088.00      8.00      0.79   0.04     0.00     0.04   0.04  78.68

avg-cpu:  %user  %nice  %system  %iowait  %steal  %idle
           3.45   0.04     2.74     4.24    0.00  89.53

Device:  rrqm/s  wrqm/s    r/s       w/s    rkB/s     wkB/s  avgrq-sz  avgqu-sz  await  r_await  w_await  svctm  %util
sdl        0.00    0.00   0.00  22182.60     0.00  88730.40      8.00      0.78   0.04     0.00     0.04   0.04  77.66

avg-cpu:  %user  %nice  %system  %iowait  %steal  %idle
           4.21   0.04     2.51     3.40    0.00  89.83

Device:  rrqm/s  wrqm/s    r/s       w/s    rkB/s     wkB/s  avgrq-sz  avgqu-sz  await  r_await  w_await  svctm  %util
sdl        0.00    0.00   0.00  22392.00     0.00  89568.00      8.00      0.79   0.04     0.00     0.04   0.04  79.26

avg-cpu:  %user  %nice  %system  %iowait  %steal  %idle
           4.94   0.04     3.35     3.40    0.00  88.26

Device:  rrqm/s  wrqm/s    r/s       w/s    rkB/s     wkB/s  avgrq-sz  avgqu-sz  await  r_await  w_await  svctm  %util
sdl        0.00    0.00   0.00  22078.40     0.00  88313.60      8.00      0.79   0.04     0.00     0.04   0.04  78.70

avg-cpu:  %user  %nice  %system  %iowait  %steal  %idle
           4.43   0.04     3.02     4.68    0.00  87.83

Device:  rrqm/s  wrqm/s    r/s       w/s    rkB/s     wkB/s  avgrq-sz  avgqu-sz  await  r_await  w_await  svctm  %util
sdl        0.00    0.00   0.00  22141.60     0.00  88566.40      8.00      0.77   0.04     0.00     0.04   0.03  77.24

avg-cpu:  %user  %nice  %system  %iowait  %steal  %idle
           4.16   0.04     2.82     4.66    0.00  88.32

Device:  rrqm/s  wrqm/s    r/s       w/s    rkB/s     wkB/s  avgrq-sz  avgqu-sz  await  r_await  w_await  svctm  %util
sdl        0.00    0.00   0.00  22177.00     0.00  88708.00      8.00      0.78   0.04     0.00     0.04   0.04  78.24

avg-cpu:  %user  %nice  %system  %iowait  %steal  %idle
           4.09   0.03     3.02    12.34    0.00  80.52

Device:  rrqm/s  wrqm/s    r/s       w/s    rkB/s     wkB/s  avgrq-sz  avgqu-sz  await  r_await  w_await  svctm  %util
sdl        0.00    0.00   0.00  22156.60     0.00  88626.40      8.00      0.78   0.04     0.00     0.04   0.04  78.36

avg-cpu:  %user  %nice  %system  %iowait  %steal  %idle
           5.43   0.04     3.38     4.07    0.00  87.08

Device:  rrqm/s  wrqm/s    r/s       w/s    rkB/s     wkB/s  avgrq-sz  avgqu-sz  await  r_await  w_await  svctm  %util
sdl        0.00    0.00   0.00  22298.80     0.00  89195.20      8.00      0.77   0.03     0.00     0.03   0.03  77.36

avg-cpu:  %user  %nice  %system  %iowait  %steal  %idle
           7.33   0.05     4.42     4.58    0.00  83.62

Device:  rrqm/s  wrqm/s    r/s       w/s    rkB/s     wkB/s  avgrq-sz  avgqu-sz  await  r_await  w_await  svctm  %util
sdl        0.00    0.00   0.00  21905.20     0.00  87620.80      8.00      0.79   0.04     0.00     0.04   0.04  79.20

avg-cpu:  %user  %nice  %system  %iowait  %steal  %idle
           4.91   0.03     3.52     3.39    0.00  88.15

Device:  rrqm/s  wrqm/s    r/s       w/s    rkB/s     wkB/s  avgrq-sz  avgqu-sz  await  r_await  w_await  svctm  %util
sdl        0.00    0.00  12.40  18629.40    92.80  74517.60      8.00      0.67   0.04     0.10     0.04   0.04  67.18

I’m now running

fio --filename=/dev/sdl --rw=write --bs=128k --numjobs=1 --iodepth=32 --group_reporting --name=journal-test

to condition the device fully. After that I’ll perform some more tests based on mixed loads.
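For the mixed loads I’ll probably start with something along these lines; the 70/30 read/write split, queue depth and runtime are just a first guess on my part, nothing settled yet:

# 70/30 random read/write mix at 4k against the raw device;
# libaio so that the iodepth actually results in queued I/O
fio --filename=/dev/sdl --direct=1 --ioengine=libaio --rw=randrw --rwmixread=70 \
    --bs=4k --numjobs=1 --iodepth=16 --runtime=300 --time_based \
    --group_reporting --name=mixed-test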
Cheers,
Christian

--
Christian Theune · ct@xxxxxxxxxxxxxxx · +49 345 219401 0
Flying Circus Internet Operations GmbH · http://flyingcircus.io
Forsterstraße 29 · 06112 Halle (Saale) · Deutschland
HR Stendal HRB 21169 · Geschäftsführer: Christian Theune, Christian Zagrodnick