Hi, does somebody know if Red Hat will backport the new krbd features (discard, blk-mq, tcp_nodelay, ...) to the Red Hat 3.10 kernel?

Alexandre

----- Original Message -----
From: "Mark Nelson" <mnelson@xxxxxxxxxx>
To: "Nick Fisk" <nick@xxxxxxxxxx>, "Somnath Roy" <Somnath.Roy@xxxxxxxxxxx>, "aderumier" <aderumier@xxxxxxxxx>
Cc: "ceph-users" <ceph-users@xxxxxxxxxxxxxx>
Sent: Friday, March 6, 2015 17:38:09
Subject: Re: Strange krbd behaviour with queue depths

On 03/06/2015 10:27 AM, Nick Fisk wrote:
> Hi Somnath,
>
> I think you hit the nail on the head; setting librbd to not use TCP_NODELAY shows the same behaviour as with krbd.

Score (another) 1 for Somnath! :)

> Mark, if you are still interested, here are the two latency reports.
>
> Queue Depth = 1
>
>   slat (usec): min=24, max=210, avg=39.40, stdev=11.54
>   clat (usec): min=310, max=78268, avg=769.48, stdev=1764.41
>    lat (usec): min=341, max=78298, avg=808.88, stdev=1764.39
>   clat percentiles (usec):
>    |  1.00th=[  462],  5.00th=[  466], 10.00th=[  474], 20.00th=[  620],
>    | 30.00th=[  620], 40.00th=[  628], 50.00th=[  636], 60.00th=[  772],
>    | 70.00th=[  772], 80.00th=[  788], 90.00th=[  924], 95.00th=[  940],
>    | 99.00th=[ 1080], 99.50th=[ 1384], 99.90th=[33536], 99.95th=[45312],
>    | 99.99th=[63232]
>   bw (KB /s): min= 2000, max= 5880, per=100.00%, avg=4951.16, stdev=877.96
>   lat (usec) : 500=12.71%, 750=40.82%, 1000=45.41%
>   lat (msec) : 2=0.69%, 4=0.13%, 10=0.04%, 20=0.04%, 50=0.11%
>   lat (msec) : 100=0.05%
>
> Queue Depth = 2
>
>   slat (usec): min=21, max=135, avg=38.72, stdev=13.18
>   clat (usec): min=346, max=77340, avg=6450.22, stdev=13390.20
>    lat (usec): min=377, max=77368, avg=6488.94, stdev=13389.56
>   clat percentiles (usec):
>    |  1.00th=[  462],  5.00th=[  470], 10.00th=[  498], 20.00th=[  612],
>    | 30.00th=[  628], 40.00th=[  652], 50.00th=[  684], 60.00th=[  772],
>    | 70.00th=[  820], 80.00th=[  996], 90.00th=[37120], 95.00th=[38656],
>    | 99.00th=[40192], 99.50th=[40704], 99.90th=[45312], 99.95th=[64768],
>    | 99.99th=[77312]
>   bw (KB /s): min=  931, max= 1611, per=99.42%, avg=1223.84, stdev=186.30
>   lat (usec) : 500=11.37%, 750=42.60%, 1000=26.11%
>   lat (msec) : 2=3.37%, 4=0.71%, 10=0.16%, 20=0.16%, 50=15.45%
>   lat (msec) : 100=0.06%

Pretty similar latency except for that big 50 ms spike at QD=2!

> Many Thanks,
> Nick
>
>> -----Original Message-----
>> From: ceph-users [mailto:ceph-users-bounces@xxxxxxxxxxxxxx] On Behalf Of Somnath Roy
>> Sent: 06 March 2015 16:02
>> To: Alexandre DERUMIER; Nick Fisk
>> Cc: ceph-users
>> Subject: RE: Strange krbd behaviour with queue depths
>>
>> Nick,
>> I think this is because the krbd you are using runs with Nagle's algorithm enabled, i.e. TCP_NODELAY = false by default.
>> The latest krbd module should have TCP_NODELAY = true by default. You may
>> want to try that, but I think it is only available in the latest kernel.
>> librbd runs with TCP_NODELAY = true by default; you may want to try with
>> ms_tcp_nodelay = false to simulate the same behaviour with librbd.
>>
>> Thanks & Regards,
>> Somnath
>>
>> -----Original Message-----
>> From: ceph-users [mailto:ceph-users-bounces@xxxxxxxxxxxxxx] On Behalf Of Alexandre DERUMIER
>> Sent: Friday, March 06, 2015 3:59 AM
>> To: Nick Fisk
>> Cc: ceph-users
>> Subject: Re: Strange krbd behaviour with queue depths
>>
>> Hi, have you tried different I/O schedulers to compare?
>>
>> ----- Original Message -----
>> From: "Nick Fisk" <nick@xxxxxxxxxx>
>> To: "ceph-users" <ceph-users@xxxxxxxxxxxxxx>
>> Sent: Thursday, March 5, 2015 18:17:27
>> Subject: Strange krbd behaviour with queue depths
>>
>> I'm seeing strange queue depth behaviour with a kernel mapped RBD;
>> librbd does not show this problem.
>>
>> The cluster is comprised of 4 nodes with 10GB networking. OSD device
>> latency is not a factor, as the test sample is small enough to fit in page cache.
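[Somnath's point above about Nagle's algorithm can be illustrated outside Ceph entirely. The following is a minimal, hypothetical Python sketch (not Ceph or krbd code) of what enabling TCP_NODELAY on a messenger socket amounts to, i.e. the userspace analogue of ms_tcp_nodelay = true:]

```python
import socket

# Create a TCP socket. By default Nagle's algorithm is enabled: small
# writes may be held back and coalesced until the previous segment is
# ACKed, which interacts badly with latency-sensitive 4k request/reply
# traffic like RBD ops.
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)

# Disable Nagle so small requests are sent immediately.
sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)

# Verify the option took effect.
nodelay = sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY)
print("TCP_NODELAY enabled:", bool(nodelay))
sock.close()
```

[A kernel client has to set the equivalent option on its own sockets, which is why an older krbd without that change behaves differently from librbd on the same cluster.]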
>>
>> Running fio against a kernel mapped RBD:
>>
>> fio --randrepeat=1 --ioengine=libaio --direct=1 --gtod_reduce=1 --name=test \
>>     --filename=/dev/rbd/cache1/test2 --bs=4k --readwrite=randread \
>>     --iodepth=1 --runtime=10 --size=1g
>>
>> Queue Depth    IOPS
>>           1    2021
>>           2     288
>>           4     376
>>           8     601
>>          16    1272
>>          32    2467
>>          64   16901
>>         128   44060
>>
>> See how initially I get a very high number of IOs at queue depth 1, but this
>> drops dramatically as soon as I start increasing the queue depth. It's not
>> until a depth of 32 IOs that I start to get similar performance again.
>> Incidentally, when changing the read type to sequential instead of random,
>> the oddity goes away.
>>
>> Running fio with the librbd engine and the same test options, I get the
>> following:
>>
>> Queue Depth    IOPS
>>           1    1492
>>           2    3232
>>           4    7099
>>           8   13875
>>          16   18759
>>          32   17998
>>          64   18104
>>         128   18589
>>
>> As you can see, the performance scales up nicely, although the top-end IOPS
>> seem limited to around 18k. I don't know if this is due to kernel/userspace
>> performance differences or if there is a lower max queue depth limit in
>> librbd.
>>
>> Both tests were run on a small sample size to force the OSD data into page
>> cache and rule out any device latency.
>>
>> Does anyone know why kernel mapped RBDs show this weird behaviour? I don't
>> think it can be OSD/ceph config related, as it only happens with krbd.
>>
>> Nick
>>
>> _______________________________________________
>> ceph-users mailing list
>> ceph-users@xxxxxxxxxxxxxx
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
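[Nick's QD=2 collapse is consistent with the latency mix in the fio reports quoted earlier in the thread: a fixed-depth fio job obeys Little's law, IOPS ≈ queue depth / mean latency. A quick back-of-the-envelope check in Python; the mean latencies are taken from the quoted reports, and the helper function is illustrative, not from the thread:]

```python
def expected_iops(queue_depth, mean_lat_usec):
    """IOPS predicted by Little's law for a closed-loop (fixed-depth) fio job."""
    return queue_depth / (mean_lat_usec / 1e6)

# Mean total latencies from the two fio reports quoted above.
qd1 = expected_iops(1, 808.88)    # ~1236 IOPS
qd2 = expected_iops(2, 6488.94)   # ~308 IOPS

# Doubling the queue depth *lowered* throughput because ~15% of the QD=2
# IOs landed in the ~40 ms latency bucket (the Nagle / delayed-ACK stall),
# dragging the mean latency from ~0.8 ms up to ~6.5 ms.
print(f"QD=1: ~{qd1:.0f} IOPS, QD=2: ~{qd2:.0f} IOPS")
```

[These predictions line up with the reported bandwidths (4951 KB/s and 1224 KB/s at 4k ≈ 1238 and 306 IOPS), which supports the TCP_NODELAY explanation rather than an OSD-side limit.]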