Your reply makes perfect sense to me. I remember that auto-heal happens at file read time; does that mean opening a file for reading is also a global operation? Do you mean there is no way to copy 30 million files to our 66-node glusterfs cluster for parallel processing other than waiting for half a month? Can I somehow disable self-heal and get a speedup? This is looking quite bad for me.

- Wei

Mark Mielke wrote:
> On 09/28/2009 10:35 AM, Wei Dong wrote:
>> Hi All,
>>
>> I noticed a very weird phenomenon while copying data (200KB image
>> files) to our glusterfs storage. When I run only one client, it
>> copies roughly 20 files per second, but as soon as I start a second
>> client on another machine, the copy rate of the first client
>> immediately degrades to 5 files per second. When I stop the second
>> client, the first client immediately speeds back up to the original
>> 20 files per second. When I run 15 clients, the aggregate throughput
>> is about 8 files per second, much worse than with a single client.
>> Neither CPU nor network is saturated. My volume file is attached.
>> The servers run on a 66-node cluster and the clients on a 15-node
>> cluster.
>>
>> We have 33x2 servers and at most 15 client machines, so each server
>> serves fewer than 0.5 clients on average. I cannot think of a reason
>> for a distributed system to behave like this. There must be some
>> kind of central access point.
>
> Although there is probably room for the GlusterFS folk to optimize...
>
> You should consider directory write operations to involve the whole
> cluster. Creating a file is a directory write operation. Think of how
> it might have to do self-heal across the cluster, make sure the name
> is right and not already in use across the cluster, and such things.
>
> Once you get to reads and writes for a particular file, it should be
> distributed.
>
> Cheers,
> mark
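
P.S. In case it helps, this is the kind of change I am considering for the replicate sections of my volume file. It is only a sketch based on my reading of the 2.x cluster/replicate options (data-self-heal, metadata-self-heal, entry-self-heal); the volume and subvolume names below are just placeholders from my setup, so please correct me if I have the options wrong:

volume repl-01
  type cluster/replicate
  # Tentatively turn off all three self-heal modes during the bulk copy;
  # these option names are my assumption from the 2.x replicate docs.
  option data-self-heal off
  option metadata-self-heal off
  option entry-self-heal off
  subvolumes brick-01a brick-01b
end-volume

My understanding is that I could switch these back to "on" once the copy finishes and then walk the tree with a recursive ls to trigger whatever healing was skipped during the load.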