On Wed, Mar 14, 2012 at 11:09:28PM -0500, D. Dante Lorenso wrote:
> get 50-60 MB/s transfer speeds tops when sending large files (> 2GB)
> to gluster. When copying a directory of small files, we get <= 1
> MB/s performance!
>
> My question is ... is this right? Is this what I should expect from
> Gluster, or is there something we did wrong? We aren't using super
> expensive equipment, granted, but I was really hoping for better
> performance than this given that raw drive speeds using dd show that
> we can write at 125+ MB/s to each "brick" 2TB disk.

Unfortunately I don't have any experience with replicated volumes, but
the raw glusterfs protocol is very fast: a single brick which is a
12-disk raid0 stripe can give 500MB/sec easily over 10G ethernet without
any tuning. I would expect a distributed volume to work fine too, as it
just sends each request to one of N nodes.

Striped volumes are unfortunately broken on top of XFS at the moment:
http://oss.sgi.com/archives/xfs/2012-03/msg00161.html

Replicated volumes, from what I've read, need to touch both servers even
for read operations (for the self-healing functionality), and that could
be a major bottleneck.

But there are a few basic things to check:

(1) Are you using XFS for the underlying filesystems? If so, did you
mount them with the "inode64" mount option? Without this, XFS
performance sucks really badly for filesystems >1TB (an example fstab
entry is in the P.S. below).

Without inode64, even untarring files into a single directory will make
XFS distribute them between AGs, rather than allocating contiguous space
for them. This is a major trip-up and there is currently talk of
changing the default to be inode64.

(2) I have this in /etc/rc.local, which bumps the per-device readahead
and the maximum I/O request size to 1MB:

for i in /sys/block/sd*/bdi/read_ahead_kb; do echo 1024 >"$i"; done
for i in /sys/block/sd*/queue/max_sectors_kb; do echo 1024 >"$i"; done

> If I can't get gluster to work, our fail-back plan is to convert
> these 8 servers into iSCSI targets and mount the storage onto a
> Win2008 head and continue sharing to the network as before.
> Personally, I would rather us continue moving toward CentOS 6.2 with
> Samba and Gluster, but I can't justify the change unless I can
> deliver the performance.

Optimising replicated volumes I can't help with. However if you make a
simple RAID10 array on each server, and then join the servers into a
distributed gluster volume, I think it will rock (a rough sketch is in
the P.P.S. below).

What you lose is the high availability, i.e. if one server fails a
proportion of your data becomes unavailable until you fix it - but
that's no worse than your iSCSI proposal (unless you are doing something
complex, like drbd replication between pairs of nodes and HA failover of
the iSCSI target).

BTW, Linux md RAID10 with the 'far' layout is really cool; for reads it
performs like a RAID0 stripe, and it reduces head seeking for random
access.

Regards,

Brian.
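P.S. For concreteness, here's a minimal example of the inode64 mount;
the device name and mount point (/dev/sdb1, /export/brick1) are only
placeholders for whatever your bricks actually are:

# /etc/fstab - mount each brick filesystem with inode64 so XFS can put
# new inodes (and keep their data nearby) in any allocation group,
# instead of confining all inodes to the low part of the disk.
# /dev/sdb1 and /export/brick1 are example names only.
/dev/sdb1   /export/brick1   xfs   defaults,inode64   0 0

# Or mount it by hand. Note that inode64 generally can't be switched on
# with a simple "mount -o remount", so unmount and mount fresh:
umount /export/brick1
mount -o inode64 /dev/sdb1 /export/brick1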
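P.P.S. A rough sketch of the RAID10 + distribute idea. I'm assuming,
say, 8 data disks per server (sdb..sdi), a brick mounted at
/export/brick1 on each box, and servers named storage1..storage8 - all
of those names are invented, so substitute your own:

# On each server: md RAID10 across the data disks, using the 'far 2'
# layout (f2), then an XFS filesystem mounted with inode64.
mdadm --create /dev/md0 --level=10 --layout=f2 --raid-devices=8 /dev/sd[b-i]
mkfs.xfs /dev/md0
mkdir -p /export/brick1
mount -o inode64 /dev/md0 /export/brick1

# On one server (assuming the others have already been added with
# "gluster peer probe"): create a plain distributed volume - no
# 'replica' or 'stripe' keyword - with one brick per server, then
# start it.
gluster volume create dist-vol transport tcp \
    storage1:/export/brick1 storage2:/export/brick1 \
    storage3:/export/brick1 storage4:/export/brick1 \
    storage5:/export/brick1 storage6:/export/brick1 \
    storage7:/export/brick1 storage8:/export/brick1
gluster volume start dist-vol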