On Wed, 1 Feb 2017, Nick Fisk wrote:
> Further update,
>
> I set bluestore_debug_omit_block_device_write to true and this gave me
> near filestore performance, albeit still with very high write amp on the
> SSDs. So it's definitely something around that part of the code waiting
> on the writes to the spinning disks; puzzled why the commit didn't help.
>
> I also set debug logging to 20/20 for bdev, bluefs, osd and bluestore,
> and a grep didn't reveal any of the debug output from that commit, e.g.
> "defering small 0x". So possibly something isn't working as expected?

Oh, yeah, the option isn't working then.  I saw the line in my debug
output... are you sure you set bluestore_prefer_wal_size?

sage

> Nick
>
> -----Original Message-----
> From: Nick Fisk [mailto:nick@xxxxxxxxxx]
> Sent: 01 February 2017 19:03
> To: 'Sage Weil' <sweil@xxxxxxxxxx>; 'Mark Nelson' <mnelson@xxxxxxxxxx>
> Cc: ceph-devel@xxxxxxxxxxxxxxx
> Subject: RE: bluestore prefer wal size
>
> Hi Sage,
>
> First results are not looking good. It looks like write IO to the SSDs
> (sdd and sdi) is now massively amplified, somewhere in the region of
> 10x, but I'm still only getting around 100 4kB sequential write iops
> from the fio client. This is in comparison to 2500-3000 iops on the
> SSDs and ~200 iops per spinning disk (sdc, sde, sdg, sdh).
> rbd engine: RBD version: 0.1.11
> Jobs: 1 (f=1): [W(1)] [100.0% done] [0KB/476KB/0KB /s] [0/119/0 iops] [eta 00m:00s]
> rbd_iodepth32: (groupid=0, jobs=1): err= 0: pid=31171: Wed Feb  1 18:56:51 2017
>   write: io=27836KB, bw=475020B/s, iops=115, runt= 60006msec
>     slat (usec): min=6, max=142, avg=10.98, stdev= 8.38
>     clat (msec): min=1, max=271, avg= 8.61, stdev= 3.63
>      lat (msec): min=1, max=271, avg= 8.62, stdev= 3.63
>     clat percentiles (msec):
>      |  1.00th=[    8],  5.00th=[    9], 10.00th=[    9], 20.00th=[    9],
>      | 30.00th=[    9], 40.00th=[    9], 50.00th=[    9], 60.00th=[    9],
>      | 70.00th=[    9], 80.00th=[    9], 90.00th=[    9], 95.00th=[    9],
>      | 99.00th=[   17], 99.50th=[   25], 99.90th=[   33], 99.95th=[   36],
>      | 99.99th=[  273]
>     bw (KB  /s): min=  191, max=  480, per=100.00%, avg=464.18, stdev=34.60
>     lat (msec) : 2=0.01%, 4=0.04%, 10=97.69%, 20=1.55%, 50=0.69%
>     lat (msec) : 500=0.01%
>   cpu          : usr=0.13%, sys=0.12%, ctx=7224, majf=0, minf=5
>   IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
>      submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
>      complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
>      issued    : total=r=0/w=6959/d=0, short=r=0/w=0/d=0, drop=r=0/w=0/d=0
>      latency   : target=0, window=0, percentile=100.00%, depth=1
>
> I've checked via the admin socket and the prefer wal option is set to
> 8192.
> Random capture of iostat:
>
> Device:  rrqm/s  wrqm/s    r/s      w/s  rkB/s    wkB/s avgrq-sz avgqu-sz  await r_await w_await  svctm  %util
> sda        0.00   77.00   0.00     5.50   0.00   338.00   122.91     0.05  11.64    0.00   11.64   8.73   4.80
> sdb        0.00    0.00   0.00     0.00   0.00     0.00     0.00     0.00   0.00    0.00    0.00   0.00   0.00
> sdc        0.00    0.00   0.00   151.50   0.00   302.00     3.99     0.54   3.59    0.00    3.59   3.58  54.20
> sdd        0.00    0.00   0.00  1474.00   0.00  4008.00     5.44     0.09   0.06    0.00    0.06   0.06   9.20
> sde        0.00    0.00   0.00   217.00   0.00   434.00     4.00     0.91   4.20    0.00    4.20   4.20  91.20
> sdf        0.00    0.00   0.00     0.00   0.00     0.00     0.00     0.00   0.00    0.00    0.00   0.00   0.00
> sdg        0.00    0.00   0.00    66.50   0.00   134.00     4.03     0.18   2.68    0.00    2.68   2.68  17.80
> sdh        0.00    0.00   0.00   217.00   0.00   434.00     4.00     0.80   3.71    0.00    3.71   3.71  80.40
> sdi        0.00    0.00   0.00  1134.00   0.00  3082.00     5.44     0.09   0.08    0.00    0.08   0.07   8.40
>
> -----Original Message-----
> From: Sage Weil [mailto:sweil@xxxxxxxxxx]
> Sent: 01 February 2017 15:34
> To: nick@xxxxxxxxxx
> Cc: ceph-devel@xxxxxxxxxxxxxxx
> Subject: bluestore prefer wal size
>
> Hey Nick,
>
> I've updated/improved the prefer wal size PR (that sends small writes
> through the wal).  See
>
>     https://github.com/ceph/ceph/pull/13217
>
> if you want to try it out.
>
> Thanks!
> sage
>
--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
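For anyone reproducing the test, a minimal ceph.conf sketch of the settings discussed in this thread. The bluestore_prefer_wal_size option comes from PR 13217 and its name is assumed from the discussion (it may change before merge); the 8192 value and the 20/20 debug levels are the ones reported above:

```ini
[osd]
# From PR 13217; verify the exact option name on the branch you built,
# e.g. via the admin socket:
#   ceph daemon osd.<N> config get bluestore_prefer_wal_size
bluestore_prefer_wal_size = 8192

# Debug levels used above when grepping for the "defering small 0x" lines:
debug bdev = 20/20
debug bluefs = 20/20
debug osd = 20/20
debug bluestore = 20/20
```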
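As a rough cross-check of the amplification being discussed: in the iostat capture the two SSDs (sdd, sdi) are together writing about 4008 + 3082 kB/s, while the fio client only sees ~476 KB/s. A back-of-envelope sketch (numbers hard-coded from the captures; the exact ratio depends on what you attribute to WAL vs. rocksdb traffic and how many OSDs share each SSD, so treat this only as an estimate, not a measurement of the ~10x figure quoted in the thread):

```shell
# Estimate write amplification on the SSDs: combined SSD write bandwidth
# (wkB/s for sdd + sdi from iostat) divided by the client-visible write
# bandwidth reported by fio (KB/s).
ssd_wkb=$(awk 'BEGIN { print 4008 + 3082 }')   # sdd + sdi wkB/s
client_kb=476                                  # fio client KB/s
awk -v ssd="$ssd_wkb" -v cli="$client_kb" \
    'BEGIN { printf "write amp ~= %.1fx\n", ssd / cli }'
# prints: write amp ~= 14.9x
```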