Hi,
On 12/07/16 13:52, Christian Balzer wrote:
On Wed, 7 Dec 2016 12:39:11 +0100 Christian Theune wrote:
| cartman06 ~ # fio --filename=/dev/sdl --direct=1 --sync=1 --rw=write --bs=128k --numjobs=1 --iodepth=1 --runtime=60 --time_based --group_reporting --name=journal-test
| journal-test: (g=0): rw=write, bs=4K-4K/4K-4K/4K-4K, ioengine=sync, iodepth=1
| fio-2.0.14
| Starting 1 process
| Jobs: 1 (f=1): [W] [100.0% done] [0K/88852K/0K /s] [0 /22.3K/0 iops] [eta 00m:00s]
| journal-test: (groupid=0, jobs=1): err= 0: pid=28606: Wed Dec 7 11:59:36 2016
|   write: io=5186.7MB, bw=88517KB/s, iops=22129 , runt= 60001msec
|     clat (usec): min=37 , max=1519 , avg=43.77, stdev=10.89
|      lat (usec): min=37 , max=1519 , avg=43.94, stdev=10.90
|     clat percentiles (usec):
|      |  1.00th=[   39],  5.00th=[   40], 10.00th=[   40], 20.00th=[   41],
|      | 30.00th=[   41], 40.00th=[   42], 50.00th=[   42], 60.00th=[   42],
|      | 70.00th=[   43], 80.00th=[   44], 90.00th=[   47], 95.00th=[   53],
|      | 99.00th=[   71], 99.50th=[   87], 99.90th=[  157], 99.95th=[  201],
|      | 99.99th=[  478]
|     bw (KB/s)  : min=81096, max=91312, per=100.00%, avg=88519.19, stdev=1762.43
|     lat (usec) : 50=92.42%, 100=7.28%, 250=0.27%, 500=0.02%, 750=0.01%
|     lat (usec) : 1000=0.01%
|     lat (msec) : 2=0.01%
|   cpu          : usr=5.43%, sys=14.64%, ctx=1327888, majf=0, minf=6
|   IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
|      submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
|      complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
|      issued    : total=r=0/w=1327777/d=0, short=r=0/w=0/d=0
|
| Run status group 0 (all jobs):
|   WRITE: io=5186.7MB, aggrb=88516KB/s, minb=88516KB/s, maxb=88516KB/s, mint=60001msec, maxt=60001msec
|
| Disk stats (read/write):
|   sdl: ios=15/1326283, merge=0/0, ticks=1/47203, in_queue=46970, util=78.29%
That doesn’t look too bad to me, specifically the 99.99th of 478 microseconds seems fine.
The iostat during this run looks OK as well:
Both do look pretty bad to me.
Your SSD with a nominal write speed of 850MB/s is doing 88MB/s at 80% utilization. The puny 400GB DC S3610 in my example earlier can do 400MB/s per Intel specs and was at 70% with 300MB/s (so half of it journal writes!). My experience with Intel SSDs (as mentioned before in this ML) is that their stated speeds can actually be achieved within about a 10% margin when used with Ceph, be it for pure journaling or as OSDs with inline journals.
I don't see how this makes any sense. Could you correct or explain it so it does?
- 300MB/s at 4k is like 77k iops. The Intel 400GB DC S3610 spec[1] says it does 25k. So I think you should be more specific in how you tested it.
- His ssd is rated at 15k random write ops[2], so it's exceeding that by a bunch (both reported by iostat and fio around 22k) (but they don't list a sequential rating)
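
The arithmetic behind these two bullets can be checked with a small sketch. It assumes binary units (1 MB = 1024 KB, 1 KB = 1024 bytes), which is what makes 300MB/s come out at roughly 77k IOPS; the function names are made up for illustration.

```python
def iops_from_bandwidth(mb_per_s, block_kb):
    """IOPS needed to sustain a bandwidth at a fixed block size.

    Assumes binary units: 1 MB = 1024 KB.
    """
    return mb_per_s * 1024 / block_kb


def bandwidth_kb_from_iops(iops, block_kb):
    """Bandwidth in KB/s implied by an IOPS figure at a fixed block size."""
    return iops * block_kb


# 300MB/s of 4k writes, as in the DC S3610 example
print(iops_from_bandwidth(300, 4))       # 76800.0, i.e. roughly 77k

# 22129 IOPS at 4k, as reported by the fio run above
print(bandwidth_kb_from_iops(22129, 4))  # 88516, matching aggrb=88516KB/s
```

The second figure lines up with the fio summary, which supports the reading that the posted run really used 4k blocks rather than the 128k shown on the command line.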
The sequential tests are lacking. I was able to produce ~520MB/s read with 130k IOPS. From an IOPS perspective this seems fine, but the bandwidth is still lacking (although with 500 MB/s we’re slowly getting closer to the 6GBit limit).
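
As a rough sanity check on how close ~520MB/s is to the link limit, here is a minimal sketch. It assumes the "6GBit limit" refers to a 6 Gbit/s SATA or SAS link with 8b/10b line encoding; that assumption and the function name are not from the thread.

```python
def link_payload_mb_per_s(line_rate_mbit, data_bits=8, coded_bits=10):
    """Usable payload rate in MB/s (decimal) of a serial link.

    Assumes 8b/10b encoding: every 8 data bits cost 10 bits on the wire.
    Protocol overhead above the encoding layer is ignored.
    """
    payload_mbit = line_rate_mbit * data_bits / coded_bits
    return payload_mbit / 8  # bits to bytes


ceiling = link_payload_mb_per_s(6000)  # 6 Gbit/s link
print(ceiling)                         # 600.0
print(520 / ceiling)                   # fraction of the ceiling reached by the read test
```

Under these assumptions the read test sits at a bit under 90% of the encoded link ceiling, before any protocol overhead is taken into account.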
I haven’t been able to push the drive to higher bandwidth on the writing end. I did a full conditioning run which maxed out for a while around 200MB/s and then dropped to around 100MB/s, where it has stayed since. I’m surprised because the reviews I found explicitly showed >500MB/s sustained sequential write. Either I’m getting screwed because this is not _exactly_ the same drive as in the test (possible) or something else is off (also possible). I’m contacting Micron now - let’s see what kind of odyssey that will cause. ;)
As another step I’m evaluating whether I have options available to put the drive in a location where I can bypass the RAID controller, just to make sure.
- his command says bs=128k, but the output says 4k ... so he didn't really run that command for that result, or it's bugged. (is this where the confusion lies?)
Yikes. That was a copy/paste error from my terminal to my work log. I think I “verschlimmbessert” this (made it worse while trying to fix it) when I thought it was the wrong command line in the first place. This was a 4k run, the other options being identical - as described by Sebastian.
- also note he didn't set -ioengine=... so depending on how the default changes per version, others could be comparing psync or others to his ioengine=sync, so that should be specifically stated for comparing results.
My version says ioengine=sync as default. Didn’t know this varies over versions and should be specified, sorry.
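
To keep results comparable across fio versions, the block size and I/O engine can both be pinned on the command line. The sketch below only builds and prints such a command; the helper name and the `/dev/sdX` placeholder are made up for illustration, and the flags are the ones from the run above plus `--ioengine`.

```python
import shlex


def fio_journal_test_cmd(device, block_size="4k", ioengine="sync", runtime_s=60):
    """Build the journal-test fio command with bs and ioengine stated explicitly.

    Note: running the resulting command writes to the raw device and
    destroys the data on it.
    """
    return [
        "fio",
        f"--filename={device}",
        "--direct=1",
        "--sync=1",
        "--rw=write",
        f"--bs={block_size}",
        f"--ioengine={ioengine}",  # stated explicitly instead of relying on the default
        "--numjobs=1",
        "--iodepth=1",
        f"--runtime={runtime_s}",
        "--time_based",
        "--group_reporting",
        "--name=journal-test",
    ]


# /dev/sdX is a placeholder, not a device from this thread
print(shlex.join(fio_journal_test_cmd("/dev/sdX")))
```

Logging the printed command next to the fio output also avoids the kind of mismatch between command line and results that came up earlier in this thread.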
Cheers,
Christian

--
Christian Theune · ct@xxxxxxxxxxxxxxx · +49 345 219401 0
Flying Circus Internet Operations GmbH · http://flyingcircus.io
Forsterstraße 29 · 06112 Halle (Saale) · Deutschland
HR Stendal HRB 21169 · Geschäftsführer: Christian. Theune, Christian. Zagrodnick