With FIO, the raw write speed to the EBS volume looks like this:

test: (g=0): rw=write, bs=128K-128K/128K-128K, ioengine=sync, iodepth=8
Starting 1 process
Jobs: 1 (f=1): [W] [100.0% done] [0K/43124K /s] [0 /329 iops] [eta 00m:00s]
test: (groupid=0, jobs=1): err= 0: pid=6406
  write: io=1024.0MB, bw=37118KB/s, iops=289 , runt= 28250msec
    clat (usec): min=58 , max=2222 , avg=78.20, stdev=25.17
     lat (usec): min=59 , max=2223 , avg=78.89, stdev=25.19
    bw (KB/s) : min= 7828, max=60416, per=104.72%, avg=38870.65, stdev=10659.43

That is an average bandwidth of about 38.8 MB/s and an average completion latency of about 78 microseconds per IO request.
Since the FUSE module uses the same block size (128 KB) that I configured in FIO, I would expect the bandwidth of the 2x2 replica to be around 15 MB/s or more when one client writes a 1 GB file.
Currently, Gluster can go up to 22 MB/s without replication, with one client. But with the distributed replica 2x2 on 4 machines, the number for one client writing a 1 GB file goes down to 6.5 MB/s - that is the part I don't understand.
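For reference, a fio job along these lines reproduces the parameters shown in that output; the size and directory are only placeholders (a 1 GB file written to wherever the EBS volume is mounted):

; sketch of the fio job matching the output above
[test]
rw=write
bs=128k
ioengine=sync
iodepth=8
; 1 GB file, matching io=1024.0MB in the output
size=1g
; placeholder: the EBS mount point
directory=/mnt/ebs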
> I also suggest calculating network latency.

I measured the individual latencies to the server machines here:

dfs01: 402 microseconds
dfs02: 322 microseconds
dfs03: 445 microseconds
dfs04: 378 microseconds

I guess you mean something else - the cumulative latency of a set of nodes? In that case, how do I calculate it?
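My own naive attempt would be a per-block estimate, assuming each 128 KB block has to be sent to both replicas over the client's link (roughly 50 MB/s in the measurements below) before the next block goes out:

  transfer of 128 KB to 2 replicas at ~50 MB/s:  2 x ~2.5 ms = ~5 ms
  network latency + EBS completion latency:      ~0.4 ms + ~0.08 ms
  => ~5.5 ms per 128 KB block => ~23 MB/s upper bound

Even that rough upper bound is well above the 6.5 MB/s I actually measure. Is this the kind of calculation you had in mind?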
Karol

On Wed, Mar 23, 2011 at 5:56 PM, Mohit Anchlia <mohitanchlia at gmail.com> wrote:
> What were you really expecting the numbers to be? What numbers do you get
> when you write directly to the ext3 file system, bypassing GFS?
>
> I also suggest calculating network latency.
>
> On Wed, Mar 23, 2011 at 4:17 AM, karol skocik <karol.skocik at gmail.com> wrote:
>> I see my email to the list was truncated - sending it again.
>>
>> Hi,
>>  here are the measurements - the client machine is KS, and the server
>> machines are DFS0[1-4].
>> First, the setup now is:
>>
>> Volume Name: EBSOne
>> Type: Distribute
>> Status: Started
>> Number of Bricks: 1
>> Transport-type: tcp
>> Bricks:
>> Brick1: dfs01:/mnt/ebs
>>
>> With just one client machine writing a 1 GB file to EBSOne, averaged over 3 runs:
>>
>> Bandwidth (mean): 22441.84 KB/s
>> Bandwidth (deviation): 6059.24 KB/s
>> Completion latency (mean): 1274.47 usec
>> Completion latency (deviation): 1814.58 usec
>>
>> Now, the latencies:
>>
>> From KS (the client machine) to the DFS server machines, averages of 3 runs.
>>
>> Latencies:
>> dfs01: 402 microseconds
>> dfs02: 322 microseconds
>> dfs03: 445 microseconds
>> dfs04: 378 microseconds
>>
>> Bandwidths:
>> dfs01: 54 MB/s
>> dfs02: 62.5 MB/s
>> dfs03: 64 MB/s
>> dfs04: 91.5 MB/s
>>
>> Every server machine has just 1 EBS drive, an ext3 filesystem,
>> kernel 2.6.18-xenU-ec2-v1.0, and the CFQ IO scheduler.
>>
>> Any ideas? Given the numbers above, does it make sense to try
>> software RAID0 with mdadm, or perhaps another filesystem?
>>
>> Thank you for the help.
>> Regards, Karol
>>
>> On Wed, Mar 23, 2011 at 11:31 AM, karol skocik <karol.skocik at gmail.com> wrote:
>>> Hi,
>>>  here are the measurements - the client machine is KS, and the server
>>> machines are DFS0[1-4].
>>> First, the setup now is:
>>>
>>> Volume Name: EBSOne
>>> Type: Distribute
>>> Status: Started
>>> Number of Bricks: 1
>>> Transport-type: tcp
>>> Bricks:
>>> Brick1: dfs01:/mnt/ebs
>>>
>>> With just one client machine writing a 1 GB file to EBSOne, averaged over 3 runs:
>>>
>>> Bandwidth (mean): 22441.84 KB/s
>>> Bandwidth (deviation): 6059.24 KB/s
>>> Completion latency (mean): 1274.47 usec
>>> Completion latency (deviation): 1814.58 usec
>>>
>>> Now, the latencies:
>>>
>>> From KS (the client machine) to the DFS server machines, averages of 3 runs.
>>>
>>> Latencies:
>>> dfs01: 402 microseconds
>>> dfs02: 322 microseconds
>>> dfs03: 445 microseconds
>>> dfs04: 378 microseconds
>>>
>>> Bandwidths:
>>> dfs01: 54 MB/s
>>> dfs02: 62.5 MB/s
>>> dfs03: 64 MB/s
>>> dfs04: 91.5 MB/s
>>>
>>> Every server machine has just 1 EBS drive, an ext3 filesystem,
>>> kernel 2.6.18-xenU-ec2-v1.0, and the CFQ IO scheduler.
>>>
>>> Any ideas? Given the numbers above, does it make sense to try
>>> software RAID0 with mdadm, or perhaps another filesystem?
>>>
>>> Thank you for the help.
>>> Regards, Karol
>>>
>>> On Tue, Mar 22, 2011 at 6:08 PM, Mohit Anchlia <mohitanchlia at gmail.com> wrote:
>>>> Can you first run some tests with no replica and see what results you
>>>> get? Also, can you look at the network latency from the client to each of your
>>>> 4 servers and post the results?
>>>>
>>>> On Mon, Mar 21, 2011 at 1:27 AM, karol skocik <karol.skocik at gmail.com> wrote:
>>>>> Hi,
>>>>>  I am in the process of evaluating Gluster for a major BI company,
>>>>> but I was surprised by the very low write performance on Amazon EBS.
>>>>> Our setup is Gluster 3.1.2, a distributed replica 2x2 on 64-bit m1.large
>>>>> instances. Every server node has 1 EBS volume attached to it.
>>>>> The configuration of the distributed replica is the default one, apart from
>>>>> my small attempts to improve performance (io-threads, disabled io-stats
>>>>> and latency measurement):
>>>>>
>>>>> volume EBSVolume-posix
>>>>>     type storage/posix
>>>>>     option directory /mnt/ebs
>>>>> end-volume
>>>>>
>>>>> volume EBSVolume-access-control
>>>>>     type features/access-control
>>>>>     subvolumes EBSVolume-posix
>>>>> end-volume
>>>>>
>>>>> volume EBSVolume-locks
>>>>>     type features/locks
>>>>>     subvolumes EBSVolume-access-control
>>>>> end-volume
>>>>>
>>>>> volume EBSVolume-io-threads
>>>>>     type performance/io-threads
>>>>>     option thread-count 4
>>>>>     subvolumes EBSVolume-locks
>>>>> end-volume
>>>>>
>>>>> volume /mnt/ebs
>>>>>     type debug/io-stats
>>>>>     option log-level NONE
>>>>>     option latency-measurement off
>>>>>     subvolumes EBSVolume-io-threads
>>>>> end-volume
>>>>>
>>>>> volume EBSVolume-server
>>>>>     type protocol/server
>>>>>     option transport-type tcp
>>>>>     option auth.addr./mnt/ebs.allow *
>>>>>     subvolumes /mnt/ebs
>>>>> end-volume
>>>>>
>>>>> In our test, all clients start writing to different 1 GB files at the same time.
>>>>> The measured write bandwidth, with 2x2 servers:
>>>>>
>>>>> 1 client: 6.5 MB/s
>>>>> 2 clients: 4.1 MB/s
>>>>> 3 clients: 2.4 MB/s
>>>>> 4 clients: 4.3 MB/s
>>>>>
>>>>> This is not acceptable for our needs. With PVFS2 (I know it uses striping,
>>>>> which is very different from replication) we can get up to 35 MB/s.
>>>>> 2-3 times slower than that would be understandable, but 5-15 times
>>>>> slower is not, and I would like to know whether there is something we
>>>>> could try out.
>>>>>
>>>>> Could anybody publish their write speeds on a similar setup, and tips on
>>>>> how to achieve better performance?
>>>>>
>>>>> Thank you,
>>>>>  Karol
>>>>> _______________________________________________
>>>>> Gluster-users mailing list
>>>>> Gluster-users at gluster.org
>>>>> http://gluster.org/cgi-bin/mailman/listinfo/gluster-users
>>>>>
>>>>
>>>
>> _______________________________________________
>> Gluster-users mailing list
>> Gluster-users at gluster.org
>> http://gluster.org/cgi-bin/mailman/listinfo/gluster-users
>>
>