[Questions] About small files performance

Dear all

Recently, I did some work to test small-file performance over the gNFSv3 transport. Here is my scenario.

#####environment#####
==2 cluster nodes (nodeA/nodeB)==
each equipped with 2x E5-2650 CPUs, 128 GB memory and 2x 10 GbE NICs
nodeA: 10.254.3.77  10.128.3.77
nodeB: 10.254.3.78  10.128.3.78

==2 stress nodes (clientA/clientB)==
each equipped with 2x E5-2650 CPUs, 128 GB memory and 2x 10 GbE NICs
clientA: 10.254.3.75
clientB: 10.254.3.76

1) 10.254.3.* is the test network segment; 10.128.3.* is for internal cluster communication.

#####vdbench setup#####
hd=default,vdbench=/root/vdbench/,user=root,shell=ssh
#hd=hd1,system=10.254.3.xx
#hd=hd2,system=10.254.3.xx

fsd=fsd1,anchor=/mnt/smalltest1/smalltest/,depth=2,width=100,openflags=o_direct,files=100,size=64k,shared=yes

fwd=format,threads=256,xfersize=xxx
fwd=default,xfersize=xxx,fileio=random,fileselect=random,rdpct=60,threads=256
#fwd=fwd1,fsd=fsd1,host=hd1
#fwd=fwd2,fsd=fsd1,host=hd2

rd=rd1,fwd=fwd*,fwdrate=max,format=restart,elapsed=600,interval=1

1) Use *o_direct* to bypass the cache.
2) Using more than 256 threads shows no further effect in this test.
3) 1 million 64 KB files in total (100 x 100 directories x 100 files each, about 64 GB).
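
For reference, a run is launched roughly like this (the parmfile name and output directory here are placeholders, not my exact paths):

/root/vdbench/vdbench -f smalltest_parmfile -o /root/vdbench/output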

#####volume info#####
Volume Name: ttt
Type: Replicate
Volume ID: cf23b1fe-d430-4ede-b33b-b54a2c04d080
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: 10.128.3.77:/gluster/brick-mm
Brick2: 10.128.3.78:/gluster/brick-mm
Options Reconfigured:
performance.nfs.stat-prefetch: off
performance.nfs.quick-read: off
performance.nfs.io-threads: on
client.event-threads: 32
server.event-threads: 32
features.shard: off
nfs.trusted-sync: on
performance.cache-size: 4000MB
performance.io-thread-count: 64
transport.address-family: inet
performance.readdir-ahead: on
nfs.disable: off
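
For reproducibility, the non-default options above were applied with 'gluster volume set'; a minimal sketch (using the volume name ttt from the info above):

gluster volume set ttt client.event-threads 32
gluster volume set ttt server.event-threads 32
gluster volume set ttt performance.io-thread-count 64
gluster volume set ttt performance.cache-size 4000MB
gluster volume set ttt nfs.trusted-sync on
gluster volume set ttt nfs.disable off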

Note:
1) I put 10.128.3.*:/gluster/brick-mm on tmpfs, so disk I/O latency can be ignored.
2) The key option values are based on my experience of what gives the best performance.
3) The mount.nfs options are left at the defaults, because 1M 'rsize/wsize' and 'async' are already the best choice. I also dug into other options, with no significant performance difference. (A sketch of the brick and client mounts follows this list.)
4) I've also set performance.cache-size to 30GB, but it made no difference for me.
5) The network bandwidth is not saturated in any of the tests.
6) I've tried 'nfs.mem-factor' and 'rpc.outstanding-rpc-limit', but gained nothing.
7) The Gluster version is 3.8.4.
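
Roughly, the brick and client mounts look like this (the tmpfs size and client mount point are illustrative, not my exact values):

# on each cluster node: brick directory backed by tmpfs
mount -t tmpfs -o size=100g tmpfs /gluster/brick-mm

# on each stress node: NFSv3 mount of the volume with default options
mount -t nfs -o vers=3 10.254.3.77:/ttt /mnt/smalltest1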

First, I got some baseline numbers with kernel NFS for comparison; the export dir (rw,async,no_root_squash,no_all_squash) is also on tmpfs:
[testA]
nfs.client: clientA
nfs.server: nodeA
xfersize=32k
25000 ops

[testB]
nfs.client: clientA
nfs.server: nodeA
xfersize=4k
100000 ops

Then I did the gNFSv3 tests:
[testC]
gnfs.client: clientA(mount nodeA)
gnfs.server: nodeA nodeB
xfersize=32k
10000 ops

[testD]
gnfs.client: clientA(mount nodeA) clientB(mount nodeB)
gnfs.server: nodeA nodeB
xfersize=32k
10000 ops

Comparing testA with testB, the smaller xfersize achieves far more ops, and I saw the same pattern with gNFS.
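
(A rough sanity check, assuming the ops figures are per second: 25000 ops x 32 KB is roughly 800 MB/s and 100000 ops x 4 KB is roughly 400 MB/s, so neither run saturates a single 10 GbE link at ~1.25 GB/s.)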

Comparing testC with testD, it seems there is a *bottleneck* in the cluster: 10000 ops looks like a hard limit to me. Am I right? Moreover, I've added more stress nodes and higher thread counts, but with only a small effect.
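
(The same rough arithmetic applies here: 10000 ops x 32 KB is only about 320 MB/s, far below the 10 GbE links, so the limit does not look like network bandwidth.)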

We can also learn something from testA vs testC: even if gNFS itself were as efficient as kernel NFS, Gluster still loses about 60% of the ops (25000 -> 10000)!

I know Gluster is designed mainly for large files, but I'm a little greedy: is there any way to improve small-file performance?

Any ideas and/or challenges to these tests would be appreciated. Thanks in advance ;)

--
Thanks
    -Xie
_______________________________________________
Gluster-devel mailing list
Gluster-devel@xxxxxxxxxxx
http://lists.gluster.org/mailman/listinfo/gluster-devel



