On Fri, Jun 08, 2012 at 12:19:58AM -0400, olav johansen wrote:
> # mount -t glusterfs fs1:/data-storage /storage
> I've copied over my data to it again and doing a ls several times,
> takes ~0.5 seconds:
> [@web1 files]# time ls -all|wc -l

Like I said before, please also try without the "-l" flag and compare
the results. My guess is that "ls -al" or "ls -alR" is not
representative of the *real* workload you are going to ask of your
system (i.e. "scan all the files in this directory, sequentially, and
perform a stat() call on each one in turn") - but please correct me if
I'm wrong. Either way, you need to measure how much that "-l" is
costing you.

> Doing the same thing on the raw os files on one node takes 0.021s
> [@fs2 files]# time ls -all|wc -l
> 1989
>
> real    0m0.021s
> user    0m0.007s
> sys     0m0.015s

In that case it's probably all coming from cache. If you wanted to
test actual disk performance then you would run

    echo 3 > /proc/sys/vm/drop_caches

before each test (on both client and server, if they are different
machines). But from what you say, it sounds like you are actually more
interested in the cached answers anyway.

> Just as crazy reference, on another single server with SSD's (Raid 10)
> drives I get:
> files# time ls -alR|wc -l
> 2260484
>
> real    0m15.761s
> user    0m5.170s
> sys     0m7.670s
>
> For the same operation. (this server even have more files...)

You are not comparing like-for-like. A replicated volume behaves very
differently from a single brick or a distributed volume, as explained
before. If you compared a two-brick (HD) setup with an identical
two-brick (SSD) setup, then that would be meaningful. I would expect
that if everything is cacheable, you'd get the same results for both;
in that case, what you'd have shown is that the latency of the
open/stat and self-heal checks is the cause of the delay.

Like I said before, I expect that adding the "-l" flag to ls is giving
you lots of cumulative latency.
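To make the comparison concrete, here is a sketch of the timing run I
have in mind. It uses a throwaway directory so it runs anywhere; point
it at the real mount (e.g. /storage/files from your test) to measure
the actual volume. The drop_caches step is only needed for cold-cache
numbers and must be run as root on both client and server:

```shell
#!/bin/sh
# Sketch: compare a plain directory scan (readdir only) against one
# that also stat()s every entry - the per-file stat() round trips are
# where replication latency accumulates.
# The directory below is a stand-in; substitute your GlusterFS mount.
DIR=$(mktemp -d)
for i in $(seq 1 100); do touch "$DIR/file$i"; done

# For cold-cache measurements, first run (as root, on client AND
# server):
#   sync; echo 3 > /proc/sys/vm/drop_caches

time ls -aR  "$DIR" | wc -l   # readdir only, no per-file stat()
time ls -laR "$DIR" | wc -l   # readdir + one lstat() per entry

rm -rf "$DIR"
```

The difference between the two "real" times is roughly the cumulative
cost of the stat() calls that "-l" forces.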
This means that the server is actually idle for much of the time,
waiting for the next request, so it has spare capacity for handling
other clients. In other words: if your real workload is lots of
clients accessing the system concurrently, you'll get much better
total throughput than your simple tests suggest, since those are a
single client performing single operations one after the other.

> If I added two more bricks to the cluster / replicated, would this
> double read speed?

Definitely not. The latency would be the same; it's just that some
requests would go to bricks A and B, and other requests would go to
bricks C and D. For a single client doing one operation at a time, the
other two bricks would be idle and would not speed things up. However,
if you had concurrent accesses from multiple clients, the extra bricks
would give extra capacity, so the total *throughput* would be higher
when multiple clients are active.

So I repeat my earlier advice. If you really want to understand where
the performance issues are coming from, these two tests may highlight
them:

* Compare the same 2-brick replicated volume, using "ls -aR" versus
  "ls -laR"

* Compare a 2-brick replicated volume to a 2-brick distributed volume,
  using "ls -laR" on both

Regards,

Brian.
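For the second test, the distributed comparison volume can be created
alongside the replicated one. A sketch of the setup, assuming your two
servers are fs1 and fs2 and using hypothetical brick paths
(/export/brick-rep, /export/brick-dist) - these cannot run outside a
real Gluster cluster, so treat them as a template:

```shell
# Replicated volume: every file exists on both bricks, so each stat()
# involves both servers (plus self-heal checking).
gluster volume create test-rep replica 2 \
    fs1:/export/brick-rep fs2:/export/brick-rep
gluster volume start test-rep

# Distributed volume: each file lives on exactly one brick, so a
# stat() only touches one server.
gluster volume create test-dist \
    fs1:/export/brick-dist fs2:/export/brick-dist
gluster volume start test-dist

# Mount both, copy the same file tree into each, then compare:
#   time ls -laR /mnt/rep  | wc -l
#   time ls -laR /mnt/dist | wc -l
mount -t glusterfs fs1:/test-rep  /mnt/rep
mount -t glusterfs fs1:/test-dist /mnt/dist
```

If the distributed volume is markedly faster under "ls -laR", that
points at the replication-side open/stat and self-heal latency rather
than raw disk or network speed.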