performance tests

On Wed, 09 Jul 2014 07:07:50 -0500 Mark Nelson wrote:

> On 07/09/2014 06:52 AM, Xabier Elkano wrote:
> > On 09/07/14 13:10, Mark Nelson wrote:
> >> On 07/09/2014 05:57 AM, Xabier Elkano wrote:
> >>>
> >>>
> >>> Hi,
> >>>
> >>> I was doing some tests in my cluster with the fio tool: one fio
> >>> instance with 70 jobs, each job writing 1GB of random data with a 4K
> >>> block size. I ran this test with 3 variations:
> >>>
> >>> 1- Creating 70 images, 60GB each, in the pool. Using rbd kernel
> >>> module, format and mount each image as ext4. Each fio job writing in
> >>> a separate image/directory. (ioengine=libaio, queue_depth=4,
> >>> direct=1)
> >>>
> >>>      IOPS: 6542
> >>>      AVG LAT: 41ms
> >>>
> >>> 2- Creating 1 large image 4,2TB in the pool. Using rbd kernel module,
> >>> format and mount the image as ext4. Each fio job writing in a
> >>> separate file in the same directory. (ioengine=libaio,
> >>> queue_depth=4,direct=1)
> >>>
> >>>     IOPS: 5899
> >>>     AVG LAT:  47ms
> >>>
> >>> 3- Creating 1 large image 4,2TB in the pool. Use ioengine rbd in fio
> >>> to access the image through librados. (ioengine=rbd,
> >>> queue_depth=4,direct=1)
> >>>
> >>>     IOPS: 2638
> >>>     AVG LAT: 96ms
> >>>
> >>> Do these results make sense? From a Ceph perspective, is it better
> >>> to have many small images than one large one? What is the best
> >>> approach to simulate the workload of 70 VMs?
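[For reference, cases 1 and 2 boil down to a job file along these lines;
the mount point, size, and runtime here are illustrative, not the poster's
exact values:

```ini
; Case 2: 70 jobs sharing one ext4 filesystem on a single large RBD image.
; For case 1, replace numjobs/directory with 70 [job] sections, each with
; its own directory= pointing at a separately mounted image.
[global]
ioengine=libaio
iodepth=4
direct=1
rw=randwrite
bs=4k
size=1g
runtime=100
numjobs=70

[rand-write-4k]
directory=/mnt/fiotest/vtest0
```

Note that fio's option is spelled iodepth; "queue_depth" above is the
poster's shorthand for the same setting.]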
> >>
> >> I'm not sure the difference between the first two cases is enough to
> >> say much yet.  You may need to repeat the test a couple of times to
> >> ensure that the difference is more than noise.  Having said that, if
> >> we are seeing an effect, it would be interesting to know what the
> >> latency distribution looks like.  Is it consistently worse in the 2nd
> >> case, or do we see higher spikes at specific times?
> >>
> > I've repeated the tests with similar results. Each test is done with a
> > clean new rbd image, first removing any existing images in the pool and
> > then creating the new image. Between tests I am running:
> >
> >   echo 3 > /proc/sys/vm/drop_caches
> >
> > - In the first test I've created 70 images (60G) and mounted them:
> >
> > /dev/rbd1 on /mnt/fiotest/vtest0
> > /dev/rbd2 on /mnt/fiotest/vtest1
> > ..
> > /dev/rbd70 on /mnt/fiotest/vtest69
> >
> > fio output:
> >
> > rand-write-4k: (groupid=0, jobs=70): err= 0: pid=21852: Tue Jul  8 14:52:56 2014
> >    write: io=2559.5MB, bw=26179KB/s, iops=6542, runt=100116msec
> >      slat (usec): min=18, max=512646, avg=4002.62, stdev=13754.33
> >      clat (usec): min=867, max=579715, avg=37581.64, stdev=55954.19
> >       lat (usec): min=903, max=586022, avg=41957.74, stdev=59276.40
> >      clat percentiles (msec):
> >       |  1.00th=[    5],  5.00th=[   10], 10.00th=[   13], 20.00th=[   18],
> >       | 30.00th=[   21], 40.00th=[   26], 50.00th=[   31], 60.00th=[   34],
> >       | 70.00th=[   37], 80.00th=[   41], 90.00th=[   48], 95.00th=[   61],
> >       | 99.00th=[  404], 99.50th=[  445], 99.90th=[  494], 99.95th=[  515],
> >       | 99.99th=[  553]
> >      bw (KB  /s): min=    0, max=  694, per=1.46%, avg=383.29, stdev=148.01
> >      lat (usec) : 1000=0.01%
> >      lat (msec) : 2=0.12%, 4=0.63%, 10=4.82%, 20=22.33%, 50=63.97%
> >      lat (msec) : 100=5.61%, 250=0.47%, 500=2.01%, 750=0.08%
> >    cpu          : usr=0.69%, sys=2.57%, ctx=1525021, majf=0, minf=2405
> >    IO depths    : 1=1.1%, 2=0.6%, 4=335.8%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
> >       submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
> >       complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
> >       issued    : total=r=0/w=655015/d=0, short=r=0/w=0/d=0
> >       latency   : target=0, window=0, percentile=100.00%, depth=4
> >
> > Run status group 0 (all jobs):
> >    WRITE: io=2559.5MB, aggrb=26178KB/s, minb=26178KB/s, maxb=26178KB/s,
> > mint=100116msec, maxt=100116msec
> >
> > Disk stats (read/write):
> >    rbd1: ios=0/2408612, merge=0/979004, ticks=0/39436432,
> > in_queue=39459720, util=99.68%
> >
> > - In the second test I only created one large image (4,2T)
> >
> > /dev/rbd1 on /mnt/fiotest/vtest0 type ext4
> > (rw,noatime,nodiratime,data=ordered)
> >
> > fio output:
> >
> > rand-write-4k: (groupid=0, jobs=70): err= 0: pid=8907: Wed Jul  9 13:38:14 2014
> >    write: io=2264.6MB, bw=23143KB/s, iops=5783, runt=100198msec
> >      slat (usec): min=0, max=3099.8K, avg=4131.91, stdev=21388.98
> >      clat (usec): min=850, max=3133.1K, avg=43337.56, stdev=93830.42
> >       lat (usec): min=930, max=3147.5K, avg=48253.22, stdev=100642.53
> >      clat percentiles (msec):
> >       |  1.00th=[    5],  5.00th=[   11], 10.00th=[   14], 20.00th=[   19],
> >       | 30.00th=[   24], 40.00th=[   29], 50.00th=[   33], 60.00th=[   36],
> >       | 70.00th=[   39], 80.00th=[   43], 90.00th=[   51], 95.00th=[   68],
> >       | 99.00th=[  506], 99.50th=[  553], 99.90th=[  717], 99.95th=[  783],
> >       | 99.99th=[ 3130]
> >      bw (KB  /s): min=    0, max=  680, per=1.54%, avg=355.39, stdev=156.10
> >      lat (usec) : 1000=0.01%
> >      lat (msec) : 2=0.12%, 4=0.66%, 10=4.21%, 20=17.82%, 50=66.95%
> >      lat (msec) : 100=7.34%, 250=0.78%, 500=1.10%, 750=0.99%, 1000=0.02%
> >      lat (msec) : >=2000=0.04%
> >    cpu          : usr=0.65%, sys=2.45%, ctx=1434322, majf=0, minf=2399
> >    IO depths    : 1=0.2%, 2=0.1%, 4=365.4%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
> >       submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
> >       complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
> >       issued    : total=r=0/w=579510/d=0, short=r=0/w=0/d=0
> >       latency   : target=0, window=0, percentile=100.00%, depth=4
> >
> > Run status group 0 (all jobs):
> >    WRITE: io=2264.6MB, aggrb=23142KB/s, minb=23142KB/s, maxb=23142KB/s,
> > mint=100198msec, maxt=100198msec
> >
> > Disk stats (read/write):
> >    rbd1: ios=0/2295106, merge=0/926648, ticks=0/39660664,
> > in_queue=39706288, util=99.80%
> >
> >
> >
> > It seems that latency is more stable in the first case.
> 
> So I guess what comes to mind is that when you have all of the fio
> processes writing to files on a single file system, there's now a whole
> additional layer of locks and contention.  Not sure how likely this is,
> though.  Josh might be able to chime in if there's something on the RBD
> side that could slow this kind of use case down.
> 
> >
> >
> >> In case 3, do you have multiple fio jobs going or just 1?
> > In all three cases, I am using one fio process with NUMJOBS=70
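[A single fio process driving one image through librbd, as in case 3, would
look roughly like this; the pool, image, and client names are illustrative,
and the rbd engine bypasses the kernel block layer entirely:

```ini
; Case 3: all 70 jobs open the same RBD image via librbd (no kernel mount).
[global]
ioengine=rbd
clientname=admin
pool=rbd
rbdname=fio-test
iodepth=4
direct=1
rw=randwrite
bs=4k
size=1g
numjobs=70

[rand-write-4k]
```
]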
> 
> Is RBD cache enabled?  It's interesting that librbd is so much slower in 
> this case than kernel RBD for you.  If anything I would have expected 
> the opposite.
> 
Come again?
User space RBD with the default values will have little to no impact in
this scenario.

Whereas kernel space RBD will be able to use every last byte of memory for
page cache, totally ousting user space RBD.
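
[To make that concrete: librbd's write-back cache is capped by "rbd cache
size", while the kernel client can dirty all free RAM as page cache. A
ceph.conf sketch for the librbd side; the option names are real, and the
values shown are the commonly cited defaults of this era, so verify them
against your version:

```ini
[client]
rbd cache = true
rbd cache size = 33554432                 ; 32MB cache per image
rbd cache max dirty = 25165824            ; 24MB dirty-data limit
rbd cache writethrough until flush = true ; writethrough until a flush is seen
```

The fio rbd engine reads the same ceph.conf as any librbd client, so these
settings apply to case 3 as well.]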

Regards,

Christian

> >>
> >>>
> >>>
> >>> thanks in advance for any help,
> >>> Xabier
> >>> _______________________________________________
> >>> ceph-users mailing list
> >>> ceph-users at lists.ceph.com
> >>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> >>>
> >>
> >
> 
> 


-- 
Christian Balzer        Network/Systems Engineer                
chibi at gol.com   	Global OnLine Japan/Fusion Communications
http://www.gol.com/

