On 09/26/2011 07:34 AM, Emmanuel Noobadmin wrote:
> I've been leaning towards actually deploying gluster in one of my
> projects for a while and finally a probable candidate project came up.
>
> However, researching into the specific use case, it seems that gluster
> isn't really suitable for load profiles that deal with lots of
> concurrent small files. e.g.
>
> http://www.techforce.com.br/news/linux_blog/glusterfs_tuning_small_files
> http://rackerhacker.com/2010/12/02/keep-web-servers-in-sync-with-drbd-and-ocfs2/
> http://bugs.gluster.com/show_bug.cgi?id=2869
> http://gluster.org/pipermail/gluster-users/2011-June/007970.html
>
> The first two are rather old so maybe the situation has changed. But
> the bug report and mailing list issue in June ring alarm bells.
>
> Is gluster really unsuited for this kind of workload or have things
> improved since then?

I guess the question to ask here is: do you need a lot of read/write
performance for your application, or are redundancy and synchronisation
more important?

In my own tests I used rsync to transfer 14 TB of data to our two new
glusterfs storage nodes. The data was composed of about 500 GB of small
JPEGs, and the rest was video files. As you can guess, rsync is not so
good with lots of small files, at least not THAT many small files, so
over a 10 Gigabit Ethernet connection we got about 10-30 megabytes per
second on the small files. Once we got to the big files, we managed
about 100-150 megabytes per second. That is definitely not the maximum
the system was capable of, but then again, these weren't ideal testing
conditions.

A simple

    dd if=/dev/zero | pv | dd of=/storage/testfile.dmp

on a locally mounted glusterfs mount resulted in about 200-250
megabytes per second. For reference, an iperf run between the two nodes
showed a maximum network speed of around 5 gigabits per second.

Of course, regardless of what other people might have experienced, your best bet is to test it with your own equipment. There are so many variables between differing distros, kernels, optimisations, and hardware that it's hard to guarantee any kind of minimum performance.
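
If you want a quick way to compare small-file and large-file writes on your own mount, something along these lines might be a starting point. It is only a rough sketch, not what I ran for the numbers above: the function name `write_files`, the file counts and sizes, and the `TARGET_DIR` setting are all made up for illustration, and the sizes are kept deliberately tiny so it finishes quickly. Point `TARGET_DIR` at your glusterfs mount (for example the /storage mount from the dd test) and raise the sizes to get meaningful figures.

```python
import os
import tempfile
import time

# Hypothetical setting: change this to your glusterfs mount point.
# It defaults to the system temp directory so the script runs anywhere.
TARGET_DIR = tempfile.gettempdir()


def write_files(directory, count, size_bytes, prefix="bench", sync=True):
    """Write `count` files of `size_bytes` each and return MB/s (10^6 bytes)."""
    # Reuse one random block of at most 1 MiB instead of generating all data.
    payload = os.urandom(min(size_bytes, 1024 * 1024))
    start = time.perf_counter()
    for i in range(count):
        path = os.path.join(directory, f"{prefix}_{i:06d}.dat")
        with open(path, "wb") as fh:
            remaining = size_bytes
            while remaining > 0:
                chunk = payload[:remaining]
                fh.write(chunk)
                remaining -= len(chunk)
            if sync:
                # Force the data out of the page cache so the timing
                # reflects the storage rather than local memory.
                fh.flush()
                os.fsync(fh.fileno())
    elapsed = max(time.perf_counter() - start, 1e-9)
    return (count * size_bytes / 1e6) / elapsed


if __name__ == "__main__":
    # Illustrative workload only: 200 files of 50 KiB, then one 10 MiB file.
    with tempfile.TemporaryDirectory(dir=TARGET_DIR) as workdir:
        small = write_files(workdir, 200, 50 * 1024, prefix="small")
        large = write_files(workdir, 1, 10 * 1024 * 1024, prefix="large")
    print(f"small files: {small:.1f} MB/s")
    print(f"large file:  {large:.1f} MB/s")
```

The gap between the two figures on your own hardware will tell you more than anyone else's benchmark. Keep in mind this only measures sequential writes from a single client, so it says nothing about concurrent access or reads.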