We have had a lot of QEMU performance related threads on this mailing list; you may get some insight from those discussions. You could also run rbd bench-write to see how many IOPS you can get outside the VM.
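As a rough sketch (using the ssd_volume pool from your libvirt config quoted below; the bench-test image name is just a placeholder, and the exact options may differ a little between rbd versions), something like this creates a 10GB scratch image, does 4k writes from 16 client threads straight through librbd with no QEMU/virtio in the path, and then removes the image:

# rbd create ssd_volume/bench-test --size 10240
# rbd bench-write ssd_volume/bench-test --io-size 4096 --io-threads 16
# rbd rm ssd_volume/bench-test

If that gets you well above the ~4k IOPS you see inside the guest, the ceiling is more likely in the QEMU/virtio layer (for example a single IO thread per disk) than on the OSD side.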
On Thu, Nov 19, 2015 at 6:46 PM, Sean Redmond <sean.redmond1@xxxxxxxxx> wrote:
> Hi Mike/Warren,
>
> Thanks for helping out here. I am running the below fio command to test this
> with 4 jobs and an iodepth of 128:
>
> fio --time_based --name=benchmark --size=4G --filename=/mnt/test.bin
> --ioengine=libaio --randrepeat=0 --iodepth=128 --direct=1 --invalidate=1
> --verify=0 --verify_fatal=0 --numjobs=4 --rw=randwrite --blocksize=4k
> --group_reporting
>
> The QEMU instance is created using nova; the settings I can see in the
> config are below:
>
>     <disk type='network' device='disk'>
>       <driver name='qemu' type='raw' cache='writeback'/>
>       <auth username='$$'>
>         <secret type='ceph' uuid='$$'/>
>       </auth>
>       <source protocol='rbd' name='ssd_volume/volume-$$'>
>         <host name='$$' port='6789'/>
>         <host name='$$' port='6789'/>
>         <host name='$$' port='6789'/>
>       </source>
>       <target dev='vde' bus='virtio'/>
>       <serial>$$</serial>
>       <address type='pci' domain='0x0000' bus='0x00' slot='0x09' function='0x0'/>
>     </disk>
>
> The below shows the output from running fio:
>
> # fio --time_based --name=benchmark --size=4G --filename=/mnt/test.bin
> --ioengine=libaio --randrepeat=0 --iodepth=128 --direct=1 --invalidate=1
> --verify=0 --verify_fatal=0 --numjobs=4 --rw=randwrite --blocksize=4k
> --group_reporting
> fio: time_based requires a runtime/timeout setting
> benchmark: (g=0): rw=randwrite, bs=4K-4K/4K-4K/4K-4K, ioengine=libaio, iodepth=128
> ...
> benchmark: (g=0): rw=randwrite, bs=4K-4K/4K-4K/4K-4K, ioengine=libaio, iodepth=128
> fio-2.0.13
> Starting 4 processes
> Jobs: 3 (f=3): [_www] [99.7% done] [0K/36351K/0K /s] [0 /9087 /0 iops] [eta 00m:03s]
> benchmark: (groupid=0, jobs=4): err= 0: pid=8547: Thu Nov 19 05:16:31 2015
>   write: io=16384MB, bw=19103KB/s, iops=4775, runt=878269msec
>     slat (usec): min=4, max=2339.4K, avg=807.17, stdev=12460.02
>     clat (usec): min=1, max=2469.6K, avg=106265.05, stdev=138893.39
>      lat (usec): min=67, max=2469.8K, avg=107073.04, stdev=139377.68
>     clat percentiles (usec):
>      |  1.00th=[ 1928],  5.00th=[ 9408], 10.00th=[12352], 20.00th=[18816],
>      | 30.00th=[43776], 40.00th=[64768], 50.00th=[78336], 60.00th=[89600],
>      | 70.00th=[102912], 80.00th=[123392], 90.00th=[216064], 95.00th=[370688],
>      | 99.00th=[733184], 99.50th=[782336], 99.90th=[1044480], 99.95th=[2088960],
>      | 99.99th=[2342912]
>     bw (KB/s)  : min=4, max=14968, per=26.11%, avg=4987.39, stdev=1947.67
>     lat (usec) : 2=0.01%, 20=0.01%, 50=0.01%, 100=0.05%, 250=0.30%
>     lat (usec) : 500=0.24%, 750=0.11%, 1000=0.08%
>     lat (msec) : 2=0.23%, 4=0.46%, 10=4.47%, 20=15.08%, 50=11.28%
>     lat (msec) : 100=35.47%, 250=23.52%, 500=5.92%, 750=1.96%, 1000=0.70%
>     lat (msec) : 2000=0.06%, >=2000=0.06%
>   cpu          : usr=0.62%, sys=2.42%, ctx=1602209, majf=1, minf=101
>   IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.1%, >=64=100.0%
>      submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
>      complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.1%
>      issued    : total=r=0/w=4194304/d=0, short=r=0/w=0/d=0
>
> Run status group 0 (all jobs):
>   WRITE: io=16384MB, aggrb=19102KB/s, minb=19102KB/s, maxb=19102KB/s, mint=878269msec, maxt=878269msec
>
> Disk stats (read/write):
>   vde: ios=1119/4330437, merge=0/105599, ticks=556/121755054, in_queue=121749666, util=99.86%
>
> The below shows lspci from within the guest:
>
> # lspci | grep -i scsi
> 00:04.0 SCSI storage controller: Red Hat, Inc Virtio block device
>
> Thanks
>
> On Wed, Nov 18, 2015 at 7:05 PM, Warren Wang - ISD <Warren.Wang@xxxxxxxxxxx> wrote:
>>
>> What were you using for iodepth and numjobs? If you’re getting an average
>> of 2ms per operation, and you’re single threaded, I’d expect about 500 IOPS
>> per thread, until you hit the limit of your QEMU setup, which may be a single
>> IO thread. That’s also what I think Mike is alluding to.
>>
>> Warren
>>
>> From: Sean Redmond <sean.redmond1@xxxxxxxxx>
>> Date: Wednesday, November 18, 2015 at 6:39 AM
>> To: "ceph-users@xxxxxxxx" <ceph-users@xxxxxxxx>
>> Subject: All SSD Pool - Odd Performance
>>
>> Hi,
>>
>> I have a performance question for anyone running an SSD-only pool. Let me
>> detail the setup first:
>>
>> 12 x Dell PowerEdge R630 (2 x 2620v3, 64GB RAM)
>> 8 x Intel DC 3710 800GB
>> Dual-port Solarflare 10Gb/s NIC (one front and one back)
>> Ceph 0.94.5
>> Ubuntu 14.04 (3.13.0-68-generic)
>>
>> The above is in one pool that is used for QEMU guests. A 4k fio test on
>> the SSD directly yields around 55k IOPS; the same test inside a QEMU guest
>> seems to hit a limit around 4k IOPS. If I deploy multiple guests they can
>> all reach 4k IOPS simultaneously.
>>
>> I don't see any evidence of a bottleneck on the OSD hosts. Is this limit
>> inside the guest expected, or am I just not looking deep enough yet?
>>
>> Thanks
>

--
Best Regards,

Wheat
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com