I ran some dd tests locally and see a big difference when using O_DIRECT. I have a 10k rpm SAS drive with a claimed seek time of 3 ms, so I don't understand this behaviour:

dd if=/dev/zero of=/root/junk bs=128k count=8000 oflag=direct
8000+0 records in
8000+0 records out
1048576000 bytes (1.0 GB) copied, 58.8426 seconds, 17.8 MB/s

dd if=/dev/zero of=/root/junk bs=128k count=8000
8000+0 records in
8000+0 records out
1048576000 bytes (1.0 GB) copied, 1.22749 seconds, 854 MB/s

Can the dev team look at my numbers with the given config, and also at Karol's data? I expect a much higher rate.

Gluster seriously lacks documentation and leaves everyone in a confused state :) Not sure how to deal with that, since there is no commercial support unless you use VMware 4.1. So I guess if it doesn't work, look for some other technology. But given all the claims I see about performance, I feel that a little better info on performance tuning would help tremendously.

On Fri, Mar 25, 2011 at 2:16 PM, Mohit Anchlia <mohitanchlia at gmail.com> wrote:
> It will be good for the dev team to look at it in parallel. It will help others too.
>
> The first thing I see is that your network bandwidth sucks. Is it
> 1 GigE? When you run tools like iperf you'd at least expect to see close
> to 800 Mbits/s. For example, in my env, if I run iperf I get something like:
>
> ------------------------------------------------------------
> TCP window size: 16.0 KByte (default)
> ------------------------------------------------------------
> [  6] local 10.1.101.193 port 49503 connected with 10.1.101.149 port 5001
> [  4]  0.0-10.0 sec   975 MBytes   815 Mbits/sec
> [  5] local 10.1.101.193 port 5001 connected with 10.1.101.149 port 41642
>
> Can you also try another dd test directly on the gluster server where
> the volume is and post the results?
>
> Regarding the other perf-related questions, I haven't tried those
> myself yet, so I think you will need to change one at a time and experiment
> with them. But if there is an inherent perf problem with the server and
> underlying storage then those may not be that helpful.
>
> On Thu, Mar 24, 2011 at 3:55 AM, karol skocik <karol.skocik at gmail.com> wrote:
>> Hi Vikas, Mohit,
>> I should disclose our typical use cases:
>> we need to read and write files several hundred MB in size - the
>> read : write ratio is about 1:1.
>>
>>> What did you use to calculate latency?
>>
>> I used http://www.bitmover.com/lmbench - they have a tool "lat_tcp".
>>
>> The numbers below are from the lmbench tool "bw_tcp":
>>
>>> Network bandwidths:
>>> dfs01: 54 MB/s
>>> dfs02: 62.5 MB/s
>>> dfs03: 64 MB/s
>>> dfs04: 91.5 MB/s
>>
>> The setup is Gluster native, no NFS.
>>
>> About the "Optimizing Gluster" link - I have seen it before, but there
>> are several things I don't understand:
>>
>> 1.) Tuning FUSE to use a larger blocksize - when testing PVFS, we
>> achieved the best performance with bs = 4 MB.
>> It's hard to understand why it's hardcoded to 128 KB.
>> I have also read elsewhere (referencing FUSE) that a larger
>> blocksize doesn't yield more performance.
>> I would guess that when transferring a larger amount of data over a network with
>> significant latency, far fewer IO requests should result in higher throughput.
>> (And it's also cheaper on EBS.)
>>
>> Are the listed adjustments to the FUSE kernel module still applicable?
>>
>> 2.) Enabling direct-io mode
>> Does this work on the current 3.1.2?:
>>
>> glusterfs --direct-io-mode=write-only -f <spec-file> <mount-point>
>>
>> also with --direct-io-mode=read-write ?
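I haven't tried direct-io-mode on 3.1.2 myself, so I can't say whether the
write-only/read-write values still apply there. If you want to experiment, my
understanding is that it can also be passed as a mount option - the volume name,
server and mount point below are just placeholders for the example, not from
your setup:

# example only: assumes a volume "testvol" served from dfs01, mounted at /mnt/gluster
mount -t glusterfs -o direct-io-mode=enable dfs01:/testvol /mnt/gluster

If that option isn't accepted on your version, falling back to the glusterfs
binary with --direct-io-mode as you quoted is the other thing to try.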
>>
>> Of those parameters in "Setting Volume Options", could this one help:
>> - performance.write-behind-window-size - increasing it 10-20 times?
>>
>> Now, the raw block device throughput (dd if=/dev/zero
>> of=/path/to/ebs/mount bs=128k count=4096 oflag=direct),
>> 3 measurements on the server machines dfs0[1-4]:
>>
>> dfs01: 9.0 MB/s, 16.4 MB/s, 18.4 MB/s
>> dfs02: 26.0 MB/s, 28.5 MB/s, 13.0 MB/s
>> dfs03: 14.4 MB/s, 11.8 MB/s, 32.6 MB/s
>> dfs04: 35.5 MB/s, 33.1 MB/s, 31.9 MB/s
>>
>> This, indeed, varies considerably!
>>
>> Thanks for the help.
>> Karol
>>
>>
>> On Wed, Mar 23, 2011 at 7:06 PM, Vikas Gorur <vikas at gluster.com> wrote:
>>> Karol,
>>>
>>> A few general pointers about EBS performance:
>>>
>>> We've seen throughput to an EBS volume vary considerably. Since EBS is iSCSI underneath, throughput to a volume can fluctuate, and it is also possible that your instance is on degraded hardware that gets very low throughput to the volume.
>>>
>>> So I would advise you to first gather some data about all your EBS volumes. You can measure throughput to them by doing something like:
>>>
>>> dd if=/dev/zero of=/path/to/ebs/mount bs=128k count=4096 oflag=direct
>>>
>>> The "oflag=direct" will give us the raw block device throughput, without the kernel cache in the way.
>>>
>>> The performance you see on the Gluster mountpoint will be a function of the EBS performance. You might also want to spin up a couple more instances and check their EBS throughput to get an idea of the range of EBS performance.
>>>
>>> Doing a RAID0 of 4 or 8 EBS volumes using mdadm will also help you increase performance.
>>>
>>> ------------------------------
>>> Vikas Gorur
>>> Engineer - Gluster, Inc.
>>> ------------------------------
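P.S. If anyone wants to try the RAID0 suggestion from Vikas, a rough sketch with
mdadm would look something like the lines below. The device names and mount point
are made up for the example - use whatever your EBS volumes actually show up as.
I haven't benchmarked RAID0 on EBS myself, so treat this only as a starting point:

# stripe 4 EBS volumes into one md device (example device names)
mdadm --create /dev/md0 --level=0 --raid-devices=4 /dev/sdf /dev/sdg /dev/sdh /dev/sdi
# create a filesystem and mount it where the gluster export/brick lives (example path)
mkfs.ext3 /dev/md0
mkdir -p /export/gluster
mount /dev/md0 /export/gluster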