OK, so the previous good results are indeed too good to be true. Here's
a more reasonable evaluation:
http://www.cs.princeton.edu/~wdong/gluster/large.gif , where I enlarged
the number of images created by 10x so that everything no longer fits in
main memory. It still looks good to me.

Wei Dong wrote:
> I think it is fuse that causes the slowness. I ran all experiments
> with booster enabled and here's the new figure:
> http://www.cs.princeton.edu/~wdong/gluster/summary-booster.gif . The
> numbers are MUCH better than NFS in most cases except for the local
> setting, which is not practically interesting. The interesting thing
> is that all of a sudden, the deletion rate drops by a factor of 4-10 --
> though I don't really care about file deletion.
>
> I must say that I'm totally satisfied with the results.
>
> - Wei
>
>
> Wei Dong wrote:
>> Hi All,
>>
>> I complained about the low file creation rate with GlusterFS on my
>> cluster weeks ago, and Avati suggested I start with a small number of
>> nodes. I finally got some time to seriously benchmark GlusterFS with
>> Bonnie++ today, and the results confirm that GlusterFS is indeed slow
>> in terms of file creation. My application is to store a large number
>> of ~200KB image files. I use the following bonnie++ command for
>> evaluation (create 10K files of 200KB each, scattered under 100
>> directories):
>>
>> bonnie++ -d . -s 0 -n 10:200000:200000:100
>>
>> Since sequential I/O is not that interesting to me, I only keep the
>> random I/O results.
>>
>> My hardware configuration is 2x quad-core Xeon E5430 2.66GHz, 16GB of
>> memory, and 4x Seagate 1.5TB 7200RPM hard drives. The machines are
>> connected with gigabit ethernet.
>>
>> I ran several GlusterFS configurations, each named N-R-T, where N is
>> the number of replicated volumes aggregated, R is the number of
>> replicas, and T is the number of server-side I/O threads. I use one
>> machine to serve one volume, so there are NxR servers and one
>> separate client running for each experiment. On the client side, the
>> server volumes are first replicated and then aggregated -- even with
>> the 1-1-2 configuration, the single volume is wrapped by a replicate
>> and a distribute translator. To show the overhead of those
>> translators, I also ran a "simple" configuration, which is 1-1-2
>> without the extra replicate & distribute translators, and a "local"
>> configuration, which is "simple" with client & server running on the
>> same machine. These configurations are compared against "nfs" and
>> "nfs-local", the latter being NFS with server and client on the same
>> machine. The GlusterFS volume file templates are attached to this
>> email.
>>
>> The result is at
>> http://www.cs.princeton.edu/~wdong/gluster/summary.gif . The
>> bars/numbers shown are operations/second, so larger is better.
>>
>> The figure shows the following:
>> 1. GlusterFS does an exceptionally good job of deleting files, but
>> creates and reads files much more slowly than NFS.
>> 2. At least for the one-node server configuration, the network
>> doesn't affect the file creation rate but does affect the file read
>> rate.
>> 3. The extra dummy replicate & distribute translators lower the file
>> creation rate by almost half.
>> 4. Replication doesn't hurt performance a lot.
>> 5. I'm running only a single-threaded benchmark, so it's hard to say
>> much about scalability, but adding more servers does help a little
>> bit even in the single-threaded setting.
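>>
>> (To make the "simple" configuration concrete: on the client side it
>> is just the single protocol/client brick with write-behind stacked
>> directly on top, with no replicate or distribute translator in
>> between -- roughly the sketch below, modulo naming; the exact volume
>> files are in the run.tar.gz linked at the end.)
>>
>> # single remote brick, no replicate/distribute wrapped around it
>> volume brick-0-0
>>   type protocol/client
>>   option transport-type tcp
>>   option remote-host c8-0-0
>>   option remote-port 6999
>>   option remote-subvolume brick
>> end-volume
>>
>> # write-behind sits directly on top of the brick
>> volume client
>>   type performance/write-behind
>>   option cache-size 32MB
>>   option flush-behind on
>>   subvolumes brick-0-0
>> end-volume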
>>
>> Note that my results are not really that different from
>> http://gluster.com/community/documentation/index.php/GlusterFS_2.0_I/O_Benchmark_Results,
>> where the file creation rate for the single-node configuration is
>> about 30/second.
>>
>> I see no reason why GlusterFS has to be so much slower than NFS at
>> file creation in the single-node configuration. I'm wondering if
>> someone here can help me figure out what's wrong in my configuration
>> or what's wrong in the GlusterFS implementation.
>>
>> - Wei
>>
>> Server volume:
>>
>> volume posix
>>   type storage/posix
>>   option directory /state/partition1/wdong/gluster
>> end-volume
>>
>> volume lock
>>   type features/locks
>>   subvolumes posix
>> end-volume
>>
>> volume brick
>>   type performance/io-threads
>>   option thread-count 2
>>   subvolumes lock
>> end-volume
>>
>> volume server
>>   type protocol/server
>>   option transport-type tcp
>>   option auth.addr.brick.allow 192.168.99.*
>>   option transport.socket.listen-port 6999
>>   subvolumes brick
>> end-volume
>>
>>
>> Client volume:
>>
>> volume brick-0-0
>>   type protocol/client
>>   option transport-type tcp
>>   option remote-host c8-0-0
>>   option remote-port 6999
>>   option remote-subvolume brick
>> end-volume
>>
>> volume brick-0-1 ...
>>
>> volume rep-0
>>   type cluster/replicate
>>   subvolumes brick-0-0 brick-0-1 ...
>>
>> ...
>> volume union
>>   type cluster/distribute
>>   subvolumes rep-0 rep-1 rep-2 rep-3 rep-4 rep-5 rep-6 rep-7
>> end-volume
>>
>> volume client
>>   type performance/write-behind
>>   option cache-size 32MB
>>   option flush-behind on
>>   subvolumes union
>> end-volume
>>
>>
>> For those who are interested enough to see the real configuration
>> files, I have all the configuration files and server/client logs
>> uploaded to http://www.cs.princeton.edu/~wdong/gluster/run.tar.gz .
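
For anyone trying to reproduce this: the servers and the client are
started in the standard GlusterFS 2.0 way, roughly as below (the volume
file names and the mount point are placeholders; the real configuration
files and logs are in run.tar.gz above), and bonnie++ is run from inside
the mount:

# on each server node: start glusterfsd with the server volume file
glusterfsd -f server.vol

# on the client node: mount the client volume over FUSE
glusterfs -f client.vol /mnt/gluster

# run the benchmark inside the mount point
cd /mnt/gluster
bonnie++ -d . -s 0 -n 10:200000:200000:100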