The glusterfs version I'm using is 2.0.6.

- Wei

On Thu, Sep 10, 2009 at 2:05 PM, Wei Dong <wdong.pku at gmail.com> wrote:
> Hi All,
>
> I complained about the low file creation rate of glusterfs on my cluster a
> few weeks ago, and Avati suggested I start with a small number of nodes. I
> finally got some time to seriously benchmark glusterfs with Bonnie++ today,
> and the results confirm that glusterfs is indeed slow at creating files. My
> application stores a large number of ~200KB image files. I use the
> following bonnie++ command for evaluation (create 10K files of ~200KB each,
> scattered under 100 directories):
>
> bonnie++ -d . -s 0 -n 10:200000:200000:100
>
> Since sequential I/O is not that interesting to me, I only keep the random
> I/O results.
>
> My hardware configuration is 2x quad-core Xeon E5430 2.66GHz, 16GB memory,
> and 4x Seagate 1.5TB 7200RPM hard drives. The machines are connected with
> gigabit ethernet.
>
> I ran several GlusterFS configurations, each named N-R-T, where N is the
> number of replicated volumes aggregated, R is the number of replicas, and T
> is the number of server-side I/O threads. I use one machine to serve each
> volume, so there are NxR servers and one separate client running for each
> experiment. On the client side, the server volumes are first replicated and
> then aggregated -- even with the 1-1-2 configuration, the single volume is
> wrapped by a replicate and a distribute translator. To show the overhead of
> those translators, I also ran a "simple" configuration, which is 1-1-2
> without the extra replicate & distribute translators, and a "local"
> configuration, which is "simple" with the client & server running on the
> same machine. These configurations are compared against "nfs" and
> "nfs-local" (NFS with server and client on the same machine). The GlusterFS
> volume file templates are attached at the end of this email.
>
> The results are at http://www.cs.princeton.edu/~wdong/gluster/summary.gif.
> The bars/numbers shown are operations per second, so larger is better.
>
> The main points shown by the figure are:
>
> 1. GlusterFS does an exceptionally good job of deleting files, but creates
> and reads files much more slowly than both NFS configurations.
> 2. At least for the single-server configuration, the network doesn't affect
> the file creation rate but does affect the file read rate.
> 3. The extra dummy replicate & distribute translators lower the file
> creation rate by almost half.
> 4. Replication doesn't hurt performance much.
> 5. I'm running only a single-threaded benchmark, so it's hard to say much
> about scalability, but adding more servers does help a little even in this
> single-threaded setting.
>
> Note that my results are not really that different from
> http://gluster.com/community/documentation/index.php/GlusterFS_2.0_I/O_Benchmark_Results,
> where the file creation rate for the single-node configuration is about
> 30/second.
>
> I see no reason why GlusterFS has to be that much slower than NFS at file
> creation in a single-node configuration. I'm wondering if someone here can
> help me figure out what's wrong in my configuration or what's wrong in the
> GlusterFS implementation.
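To clarify the "simple" configuration mentioned above: it is the 1-1-2 setup
with write-behind stacked directly on the single protocol/client brick, with
no replicate or distribute translator in between. Roughly, the client volume
reduces to something like this (the exact files are in the run.tar.gz linked
at the end):

volume brick-0-0
  # single remote brick, same as in the template below
  type protocol/client
  option transport-type tcp
  option remote-host c8-0-0
  option remote-port 6999
  option remote-subvolume brick
end-volume

volume client
  # write-behind sits directly on the brick; no replicate/distribute in between
  type performance/write-behind
  option cache-size 32MB
  option flush-behind on
  subvolumes brick-0-0
end-volume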
>
> - Wei
>
> Server volume:
>
> volume posix
>   type storage/posix
>   option directory /state/partition1/wdong/gluster
> end-volume
>
> volume lock
>   type features/locks
>   subvolumes posix
> end-volume
>
> volume brick
>   type performance/io-threads
>   option thread-count 2
>   subvolumes lock
> end-volume
>
> volume server
>   type protocol/server
>   option transport-type tcp
>   option auth.addr.brick.allow 192.168.99.*
>   option transport.socket.listen-port 6999
>   subvolumes brick
> end-volume
>
>
> Client volume:
>
> volume brick-0-0
>   type protocol/client
>   option transport-type tcp
>   option remote-host c8-0-0
>   option remote-port 6999
>   option remote-subvolume brick
> end-volume
>
> volume brick-0-1 ...
>
> volume rep-0
>   type cluster/replicate
>   subvolumes brick-0-0 brick-0-1 ...
>
> ...
>
> volume union
>   type cluster/distribute
>   subvolumes rep-0 rep-1 rep-2 rep-3 rep-4 rep-5 rep-6 rep-7
> end-volume
>
> volume client
>   type performance/write-behind
>   option cache-size 32MB
>   option flush-behind on
>   subvolumes union
> end-volume
>
>
> For those who are interested enough to see the real configuration files, I
> have all the configuration files and server/client logs uploaded to
> http://www.cs.princeton.edu/~wdong/gluster/run.tar.gz
>
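For anyone who wants to reproduce this, the volume files above are used
roughly as follows (hostnames, file names and the mount point here are
placeholders; the real configuration files and logs are in the run.tar.gz
above):

# on each server machine: start glusterfsd with the server volume file
glusterfsd -f server.vol

# on the client machine: mount the client volume
glusterfs -f client.vol /mnt/gluster

# run the benchmark against the mount
cd /mnt/gluster
bonnie++ -d . -s 0 -n 10:200000:200000:100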