Re: [Gluster-users] BoF - Gluster for VM store use case

Hi,

Just for reference, we got similar values in a customer setup with three nodes, each with a single Xeon and 4x8TB HDDs, connected over a dual 10GbE backbone.

We ran a simple benchmark with the fio tool on a 1TiB virtual disk (virtio), formatted directly with XFS (no partitions, no LVM), inside a VM (Debian stretch, dual core, 4GB RAM) deployed on a Gluster volume: disperse 3, redundancy 1, distributed 2, sharding enabled.
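
As a side note on what "disperse 3, redundancy 1" costs in capacity, here is a minimal Python sketch. It only encodes the usual erasure-coding arithmetic: each dispersed subvolume keeps (disperse - redundancy) bricks' worth of data. The brick size passed in the example is hypothetical, since this mail does not say how the 4 disks per node are mapped to bricks.

```python
def disperse_usable_fraction(disperse: int, redundancy: int) -> float:
    """Fraction of raw brick capacity available for data in one dispersed subvolume."""
    # Gluster requires redundancy > 0 and disperse > 2 * redundancy.
    if redundancy <= 0 or disperse <= 2 * redundancy:
        raise ValueError("need redundancy > 0 and disperse > 2 * redundancy")
    return (disperse - redundancy) / disperse


def usable_capacity_tb(brick_tb: float, disperse: int, redundancy: int,
                       subvolumes: int) -> float:
    """Usable capacity of a distributed-dispersed volume with equal-size bricks."""
    # Validate the layout first, then count the data bricks across all subvolumes.
    disperse_usable_fraction(disperse, redundancy)
    return brick_tb * (disperse - redundancy) * subvolumes


if __name__ == "__main__":
    # disperse 3, redundancy 1: two thirds of the raw space holds data.
    print(f"usable fraction: {disperse_usable_fraction(3, 1):.3f}")
    # ASSUMPTION: 8 TB bricks, 2 subvolumes; brick size is not stated in the mail.
    print(f"usable capacity: {usable_capacity_tb(8.0, 3, 1, 2):.1f} TB")
```

With this layout one brick per subvolume can fail without data loss, at the price of one third of the raw space.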

We ran a sequential write test (10GB file in 1024k blocks), a random read test with 4k blocks, and a random write test also with 4k blocks, several times each, with results very similar to the following:

writefile: (g=0): rw=write, bs=1M-1M/1M-1M/1M-1M, ioengine=libaio, iodepth=200
fio-2.16
Starting 1 process

writefile: (groupid=0, jobs=1): err= 0: pid=11515: Thu Nov  2 16:50:05 2017
  write: io=10240MB, bw=473868KB/s, iops=462, runt= 22128msec
    slat (usec): min=20, max=98830, avg=1972.11, stdev=6612.81
    clat (msec): min=150, max=2979, avg=428.49, stdev=189.96
     lat (msec): min=151, max=2979, avg=430.47, stdev=189.90
    clat percentiles (msec):
     |  1.00th=[  204],  5.00th=[  249], 10.00th=[  273], 20.00th=[  293],
     | 30.00th=[  306], 40.00th=[  318], 50.00th=[  351], 60.00th=[  502],
     | 70.00th=[  545], 80.00th=[  578], 90.00th=[  603], 95.00th=[  627],
     | 99.00th=[  717], 99.50th=[  775], 99.90th=[ 2966], 99.95th=[ 2966],
     | 99.99th=[ 2966]
    lat (msec) : 250=5.09%, 500=54.65%, 750=39.64%, 1000=0.31%, 2000=0.07%
    lat (msec) : >=2000=0.24%
  cpu          : usr=7.81%, sys=1.48%, ctx=1221, majf=0, minf=11
  IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.2%, 32=0.3%, >=64=99.4%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.1%
     issued    : total=r=0/w=10240/d=0, short=r=0/w=0/d=0, drop=r=0/w=0/d=0
     latency   : target=0, window=0, percentile=100.00%, depth=200

Run status group 0 (all jobs):
  WRITE: io=10240MB, aggrb=473868KB/s, minb=473868KB/s, maxb=473868KB/s, mint=22128msec, maxt=22128msec

Disk stats (read/write):
  vdg: ios=0/10243, merge=0/0, ticks=0/2745892, in_queue=2745884, util=99.18%


benchmark: (g=0): rw=randread, bs=4K-4K/4K-4K/4K-4K, ioengine=libaio, iodepth=128
...
fio-2.16
Starting 4 processes

benchmark: (groupid=0, jobs=4): err= 0: pid=11529: Thu Nov  2 16:52:40 2017
  read : io=1123.9MB, bw=38347KB/s, iops=9586, runt= 30011msec
    slat (usec): min=1, max=228886, avg=415.40, stdev=3975.72
    clat (usec): min=482, max=328648, avg=52664.65, stdev=30216.00
     lat (msec): min=9, max=527, avg=53.08, stdev=30.38
    clat percentiles (msec):
     |  1.00th=[   12],  5.00th=[   22], 10.00th=[   23], 20.00th=[   25],
     | 30.00th=[   33], 40.00th=[   38], 50.00th=[   47], 60.00th=[   55],
     | 70.00th=[   64], 80.00th=[   76], 90.00th=[   95], 95.00th=[  111],
     | 99.00th=[  151], 99.50th=[  163], 99.90th=[  192], 99.95th=[  196],
     | 99.99th=[  210]
    lat (usec) : 500=0.01%, 750=0.01%, 1000=0.01%
    lat (msec) : 10=0.03%, 20=3.59%, 50=52.41%, 100=36.01%, 250=7.96%
    lat (msec) : 500=0.01%
  cpu          : usr=0.29%, sys=1.10%, ctx=10157, majf=0, minf=549
  IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.1%, >=64=99.9%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.1%
     issued    : total=r=287705/w=0/d=0, short=r=0/w=0/d=0, drop=r=0/w=0/d=0
     latency   : target=0, window=0, percentile=100.00%, depth=128

Run status group 0 (all jobs):
   READ: io=1123.9MB, aggrb=38346KB/s, minb=38346KB/s, maxb=38346KB/s, mint=30011msec, maxt=30011msec

Disk stats (read/write):
  vdg: ios=286499/2, merge=0/0, ticks=3707064/64, in_queue=3708680, util=99.83%

benchmark: (g=0): rw=randwrite, bs=4K-4K/4K-4K/4K-4K, ioengine=libaio, iodepth=128
...
fio-2.16
Starting 4 processes

benchmark: (groupid=0, jobs=4): err= 0: pid=11545: Thu Nov  2 16:55:54 2017
  write: io=422464KB, bw=14079KB/s, iops=3519, runt= 30006msec
    slat (usec): min=1, max=230620, avg=1130.75, stdev=6744.31
    clat (usec): min=643, max=540987, avg=143999.57, stdev=66693.45
     lat (msec): min=8, max=541, avg=145.13, stdev=67.01
    clat percentiles (msec):
     |  1.00th=[   34],  5.00th=[   75], 10.00th=[   87], 20.00th=[  100],
     | 30.00th=[  109], 40.00th=[  116], 50.00th=[  123], 60.00th=[  135],
     | 70.00th=[  151], 80.00th=[  182], 90.00th=[  241], 95.00th=[  289],
     | 99.00th=[  359], 99.50th=[  416], 99.90th=[  465], 99.95th=[  490],
     | 99.99th=[  529]
    lat (usec) : 750=0.01%, 1000=0.01%
    lat (msec) : 2=0.01%, 4=0.01%, 10=0.01%, 20=0.05%, 50=1.80%
    lat (msec) : 100=18.07%, 250=71.25%, 500=8.80%, 750=0.02%
  cpu          : usr=0.29%, sys=1.28%, ctx=115493, majf=0, minf=33
  IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.1%, >=64=99.8%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.1%
     issued    : total=r=0/w=105616/d=0, short=r=0/w=0/d=0, drop=r=0/w=0/d=0
     latency   : target=0, window=0, percentile=100.00%, depth=128

Run status group 0 (all jobs):
  WRITE: io=422464KB, aggrb=14079KB/s, minb=14079KB/s, maxb=14079KB/s, mint=30006msec, maxt=30006msec

Disk stats (read/write):
  vdg: ios=0/105235, merge=0/0, ticks=0/3727048, in_queue=3734796, util=99.81%
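
For anyone who wants to repeat something similar, the job parameters can be read back from the output headers above. The Python sketch below only builds the fio command lines and prints them; it does not run fio. The job names, rw mode, block size, ioengine, iodepth and process count come from the output above. The file path, the `--direct=1` flag, the size of the random tests, and the use of `--time_based` with a 30 second runtime are assumptions, not the exact commands that were used.

```python
import shlex


def fio_argv(name, rw, bs, iodepth, filename, numjobs=1, size=None, runtime=None):
    """Build an fio command line as a list of arguments (nothing is executed)."""
    argv = [
        "fio",
        f"--name={name}",
        f"--filename={filename}",
        f"--rw={rw}",
        f"--bs={bs}",
        "--ioengine=libaio",
        f"--iodepth={iodepth}",
        f"--numjobs={numjobs}",
        "--direct=1",          # ASSUMPTION: bypass the guest page cache
        "--group_reporting",   # one aggregated report for all processes
    ]
    if size is not None:
        argv.append(f"--size={size}")
    if runtime is not None:
        # ASSUMPTION: the ~30 s runs were time based
        argv += [f"--runtime={runtime}", "--time_based"]
    return argv


# HYPOTHETICAL path on the XFS filesystem inside the VM.
TEST_FILE = "/mnt/test/fio.dat"

JOBS = [
    fio_argv("writefile", "write", "1M", 200, TEST_FILE, size="10g"),
    # ASSUMPTION: 1g working set for the random tests (not stated in the mail).
    fio_argv("benchmark", "randread", "4k", 128, TEST_FILE, numjobs=4, size="1g", runtime=30),
    fio_argv("benchmark", "randwrite", "4k", 128, TEST_FILE, numjobs=4, size="1g", runtime=30),
]

if __name__ == "__main__":
    for argv in JOBS:
        print(shlex.join(argv))
```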


Basically, we got around 470MB/s sequential write, about 9500 IOPS for 4k random read, and about 3500 IOPS for 4k random write.
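
These figures are consistent with each other, which can be checked with two rules of thumb: bandwidth is roughly IOPS times block size, and by Little's law the mean latency is roughly the number of outstanding requests divided by IOPS. The Python sketch below uses only numbers copied from the fio output above. It assumes the queue stayed full for the whole run (iodepth times number of processes requests in flight), which the "IO depths" lines suggest but do not prove.

```python
def bandwidth_kb_s(iops: float, block_kb: float) -> float:
    """Approximate bandwidth from IOPS and block size."""
    return iops * block_kb


def littles_law_latency_ms(outstanding: int, iops: float) -> float:
    """Mean latency predicted by Little's law for a full queue."""
    return outstanding / iops * 1000.0


# (label, iops, block KB, iodepth, processes, reported KB/s, reported avg lat ms)
RUNS = [
    ("seq write 1M", 462, 1024, 200, 1, 473868, 430.47),
    ("rand read 4k", 9586, 4, 128, 4, 38347, 53.08),
    ("rand write 4k", 3519, 4, 128, 4, 14079, 145.13),
]

if __name__ == "__main__":
    for label, iops, bs_kb, depth, procs, bw_rep, lat_rep in RUNS:
        bw = bandwidth_kb_s(iops, bs_kb)
        lat = littles_law_latency_ms(depth * procs, iops)
        print(f"{label}: bw {bw:.0f} KB/s (reported {bw_rep}), "
              f"latency {lat:.1f} ms (reported {lat_rep})")
```

Both estimates land within about 1% of what fio reported, so the latencies mostly reflect the deep queues used in the tests.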

Hope it helps!


On 01/11/17 at 12:03, Shyam Ranganathan wrote:
On 10/31/2017 08:36 PM, Ben Turner wrote:
* Erasure coded volumes with sharding - seen as a good fit for VM disk
storage
I am working on this with a customer, we have been able to do 400-500 MB / sec writes!  Normally things max out at ~150-250.  The trick is to use multiple files, create the lvm stack and use native LVM striping.  We have found that 4-6 files seems to give the best perf on our setup.  I don't think we are using sharding on the EC vols, just multiple files and LVM striping.  Sharding may be able to avoid the LVM striping, but I bet dollars to doughnuts you won't see this level of perf:)   I am working on a blog post for RHHI and RHEV + RHS performance where I am able to in some cases get 2x+ the performance out of VMs / VM storage.  I'd be happy to share my data / findings.


Ben, we would like to hear more, so please do share your thoughts further. There are a fair number of users in the community who have this use-case and may have some interesting questions around the proposed method.

Shyam
_______________________________________________
Gluster-devel mailing list
Gluster-devel@xxxxxxxxxxx
http://lists.gluster.org/mailman/listinfo/gluster-devel
