performance tests

xelkano@xxxxxxxxxxxx (Xabier Elkano) · Wed, 09 Jul 2014 13:52:42 +0200

On 09/07/14 13:10, Mark Nelson wrote:
> On 07/09/2014 05:57 AM, Xabier Elkano wrote:
>>
>>
>> Hi,
>>
>> I was doing some tests in my cluster with fio tool, one fio instance
>> with 70 jobs, each job writing 1GB random with 4K block size. I did this
>> test with 3 variations:
>>
>> 1- Creating 70 images, 60GB each, in the pool. Using rbd kernel module,
>> format and mount each image as ext4. Each fio job writing in a separate
>> image/directory. (ioengine=libaio, queue_depth=4, direct=1)
>>
>>     IOPS: 6542
>>     AVG LAT: 41ms
>>
>> 2- Creating 1 large image 4,2TB in the pool. Using rbd kernel module,
>> format and mount the image as ext4. Each fio job writing in a separate
>> file in the same directory. (ioengine=libaio, queue_depth=4,direct=1)
>>
>>    IOPS: 5899
>>    AVG LAT:  47ms
>>
>> 3- Creating 1 large image 4,2TB in the pool. Use ioengine rbd in fio to
>> access the image through librados. (ioengine=rbd,
>> queue_depth=4,direct=1)
>>
>>    IOPS: 2638
>>    AVG LAT: 96ms
>>
>> Do these results make sense? From Ceph's perspective, is it better to have
>> many small images than one large one? What is the best approach to
>> simulate the workload of 70 VMs?
>
> I'm not sure the difference between the first two cases is enough to
> say much yet.  You may need to repeat the test a couple of times to
> ensure that the difference is more than noise.  Having said that, if
> we are seeing an effect, it would be interesting to know what the
> latency distribution is like.  Is it consistently worse in the 2nd
> case or do we see higher spikes at specific times?
>
I've repeated the tests with similar results. Each test is done with a
clean new rbd image, first removing any existing images in the pool and
then creating the new image. Between tests I am running:

 echo 3 > /proc/sys/vm/drop_caches

- In the first test I've created 70 images (60G) and mounted them:

/dev/rbd1 on /mnt/fiotest/vtest0
/dev/rbd2 on /mnt/fiotest/vtest1
..
/dev/rbd70 on /mnt/fiotest/vtest69

fio output:

rand-write-4k: (groupid=0, jobs=70): err= 0: pid=21852: Tue Jul  8 14:52:56 2014
  write: io=2559.5MB, bw=26179KB/s, iops=6542, runt=100116msec
    slat (usec): min=18, max=512646, avg=4002.62, stdev=13754.33
    clat (usec): min=867, max=579715, avg=37581.64, stdev=55954.19
     lat (usec): min=903, max=586022, avg=41957.74, stdev=59276.40
    clat percentiles (msec):
     |  1.00th=[    5],  5.00th=[   10], 10.00th=[   13], 20.00th=[   18],
     | 30.00th=[   21], 40.00th=[   26], 50.00th=[   31], 60.00th=[   34],
     | 70.00th=[   37], 80.00th=[   41], 90.00th=[   48], 95.00th=[   61],
     | 99.00th=[  404], 99.50th=[  445], 99.90th=[  494], 99.95th=[  515],
     | 99.99th=[  553]
    bw (KB  /s): min=    0, max=  694, per=1.46%, avg=383.29, stdev=148.01
    lat (usec) : 1000=0.01%
    lat (msec) : 2=0.12%, 4=0.63%, 10=4.82%, 20=22.33%, 50=63.97%
    lat (msec) : 100=5.61%, 250=0.47%, 500=2.01%, 750=0.08%
  cpu          : usr=0.69%, sys=2.57%, ctx=1525021, majf=0, minf=2405
  IO depths    : 1=1.1%, 2=0.6%, 4=335.8%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued    : total=r=0/w=655015/d=0, short=r=0/w=0/d=0
     latency   : target=0, window=0, percentile=100.00%, depth=4

Run status group 0 (all jobs):
  WRITE: io=2559.5MB, aggrb=26178KB/s, minb=26178KB/s, maxb=26178KB/s, mint=100116msec, maxt=100116msec

Disk stats (read/write):
  rbd1: ios=0/2408612, merge=0/979004, ticks=0/39436432, in_queue=39459720, util=99.68%
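
As a quick sanity check on this output, the headline figures can be recomputed from the "issued" count and the runtime. This is only a rough sketch: it assumes every issued write is exactly one 4 KiB block, and the function name check_run is made up for illustration.

```python
# Cross-check of the first run, using only values printed in the fio output.
# Assumption: every issued write is exactly one 4 KiB block.

def check_run(issued_writes, runtime_ms, block_kib=4):
    """Return (iops, bandwidth in KiB/s, total MiB written) for one run."""
    runtime_s = runtime_ms / 1000.0
    iops = issued_writes / runtime_s
    bw_kib_s = iops * block_kib
    total_mib = issued_writes * block_kib / 1024.0
    return iops, bw_kib_s, total_mib

# Values taken from "issued: total=r=0/w=655015" and "runt=100116msec".
iops, bw, total = check_run(issued_writes=655015, runtime_ms=100116)
print("iops=%.0f bw=%.0fKiB/s io=%.1fMiB" % (iops, bw, total))
```

The recomputed values land within a fraction of a percent of what fio reports (iops=6542, bw=26179KB/s, io=2559.5MB), so the summary numbers are at least internally consistent.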

- In the second test I created only one large image (4,2T):

/dev/rbd1 on /mnt/fiotest/vtest0 type ext4 (rw,noatime,nodiratime,data=ordered)

fio output:

rand-write-4k: (groupid=0, jobs=70): err= 0: pid=8907: Wed Jul  9 13:38:14 2014
  write: io=2264.6MB, bw=23143KB/s, iops=5783, runt=100198msec
    slat (usec): min=0, max=3099.8K, avg=4131.91, stdev=21388.98
    clat (usec): min=850, max=3133.1K, avg=43337.56, stdev=93830.42
     lat (usec): min=930, max=3147.5K, avg=48253.22, stdev=100642.53
    clat percentiles (msec):
     |  1.00th=[    5],  5.00th=[   11], 10.00th=[   14], 20.00th=[   19],
     | 30.00th=[   24], 40.00th=[   29], 50.00th=[   33], 60.00th=[   36],
     | 70.00th=[   39], 80.00th=[   43], 90.00th=[   51], 95.00th=[   68],
     | 99.00th=[  506], 99.50th=[  553], 99.90th=[  717], 99.95th=[  783],
     | 99.99th=[ 3130]
    bw (KB  /s): min=    0, max=  680, per=1.54%, avg=355.39, stdev=156.10
    lat (usec) : 1000=0.01%
    lat (msec) : 2=0.12%, 4=0.66%, 10=4.21%, 20=17.82%, 50=66.95%
    lat (msec) : 100=7.34%, 250=0.78%, 500=1.10%, 750=0.99%, 1000=0.02%
    lat (msec) : >=2000=0.04%
  cpu          : usr=0.65%, sys=2.45%, ctx=1434322, majf=0, minf=2399
  IO depths    : 1=0.2%, 2=0.1%, 4=365.4%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued    : total=r=0/w=579510/d=0, short=r=0/w=0/d=0
     latency   : target=0, window=0, percentile=100.00%, depth=4

Run status group 0 (all jobs):
  WRITE: io=2264.6MB, aggrb=23142KB/s, minb=23142KB/s, maxb=23142KB/s, mint=100198msec, maxt=100198msec

Disk stats (read/write):
  rbd1: ios=0/2295106, merge=0/926648, ticks=0/39660664, in_queue=39706288, util=99.80%

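Latency looks more stable in the first case: the median clat is about
the same (31 ms vs 33 ms), but the tail is much longer with the single
image (99.99th percentile 553 ms vs 3130 ms, clat stdev about 56 ms vs
94 ms).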
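
One thing worth keeping in mind when reading these numbers: with a fixed number of outstanding I/Os, average latency and IOPS are tied together by Little's law (latency = in-flight I/Os / IOPS). The sketch below assumes the queues stay full the whole time, i.e. 70 jobs x depth 4 = 280 I/Os in flight, and that this also holds for the fio rbd engine run, which I have not verified. The function name is only illustrative.

```python
# Little's law: average latency = outstanding I/Os / throughput.
# Assumption: all 70 jobs keep their queue of depth 4 full for the whole run.

def expected_latency_ms(jobs, iodepth, iops):
    """Average latency (ms) implied by a fixed number of in-flight I/Os."""
    in_flight = jobs * iodepth
    return in_flight / iops * 1000.0

# (measured IOPS, average latency reported by fio in ms)
runs = {
    "70 images, krbd": (6542, 41.96),
    "1 image, krbd": (5783, 48.25),
    "1 image, fio rbd engine": (2638, 96.0),
}

for name, (iops, reported_ms) in runs.items():
    predicted = expected_latency_ms(jobs=70, iodepth=4, iops=iops)
    print("%-24s predicted=%.1fms reported=%.1fms" % (name, predicted, reported_ms))
```

For the two krbd runs the predicted value is within about 1 ms of the reported average, so the average latency there is mostly a restatement of the IOPS figure. The percentiles and the stdev carry the additional information about stability. For the rbd engine run the prediction is roughly 10% above the reported 96 ms, which may mean the 280 in-flight assumption does not quite hold there.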

> In case 3, do you have multiple fio jobs going or just 1?
In all three cases, I am using one fio process with NUMJOBS=70
>
>>
>>
>> thanks in advance for any help,
>> Xabier
>> _______________________________________________
>> ceph-users mailing list
>> ceph-users at lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>