Re: AWS usage in a 3 replicator set with arbiter

Ravishankar N <ravishankar@xxxxxxxxxx> · Wed, 9 Dec 2015 14:18:11 +0530



      On 12/09/2015 12:46 PM, Tim wrote:

    
      Hi List,

      
      I was wondering if anyone has implemented gluster
        successfully in AWS, and has some tips on streamlining the
        process to increase throughput and possibly reduce latency.
        (Sorry in advance if this list has seen this problem a lot)

      
      My current setup is as follows:

      
      gfs-server1 - ap-southeast-2 (AZ1)

      gfs-server2 - ap-southeast-2 (AZ2)

      gfs-server3 - ap-southeast-1 (AZ1) (Arbiter)

      web-server1 - ap-southeast-2 (AZ1) (mounted via gluster/nfs from
        gfs-server1)

      web-server2 - ap-southeast-2 (AZ2) (mounted via gluster/nfs from
        gfs-server2)

      
        Using latest 3.7 package from the Ubuntu launchpad ppa. 

        
      I have one server in each availability zone within
        Australia with the arbiter volume over in Singapore. This will
        hopefully act as a fall back if ever there is a problem
        connecting internally between the two availability zones in the
        same region (assuming each gluster server can route externally
        but not internally).

      
    I think when you say volumes, you actually mean bricks, i.e. 2
      bricks of the arbiter volume are in Australia and the 3rd brick is in
      Singapore. This is not really recommended. It would be better to
      locate all bricks (and clients too) of a volume in the same region
      (you could still use different availability zones in the same
      region). gluster's replication module winds every write from the
      client to all bricks of the replica. So the closer they are, the
      faster it would be.
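
Since every write has to reach all bricks of the replica, the round-trip time from a client to each brick server is a useful first number to check. The following is a minimal sketch, assuming the `min/avg/max` summary line that iputils and BSD `ping` print; the helper name `avg_rtt_ms` is made up for illustration, and the host names are the ones from the setup quoted above.

```shell
# Hypothetical helper: read `ping` output on stdin and print the average
# round-trip time in ms, taken from the "min/avg/max" summary line.
avg_rtt_ms() {
  awk -F'/' '/^(rtt|round-trip) / { print $5 }'
}

# Usage (run on a client; host names as in the setup above):
#   for h in gfs-server1 gfs-server2 gfs-server3; do
#     printf '%s %s ms\n' "$h" "$(ping -c 5 -q "$h" | avg_rtt_ms)"
#   done
```

A brick whose average is far above the others is likely to dominate the latency of each replicated write.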

      
      This is for a webserver with a lot of wordpress + magento
        installations. So it has a lot of files.

      
      I mounted the gluster volume and started copying across
        the files and it was terribly slow. (See below for data)
        [1]

      
      My Questions are as follows:

      I see from the archives and FAQs that people have sped
        up copies by using xargs to run multiple threads, one per
        subfolder. While this is a good idea, is there any other way to
        increase throughput?
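
For reference, the xargs approach mentioned above might look roughly like this. It is only a sketch, assuming a `find` and `xargs` that support `-print0`, `-0` and `-P` (GNU and BSD versions do); the function name `parallel_copy` and the default of 8 parallel jobs are illustrative choices, not something taken from the archives.

```shell
# Hypothetical helper: copy every top-level entry of SRC into DEST,
# running up to JOBS `cp -R` processes at the same time.
parallel_copy() {
  src=$1
  dest=$2
  jobs=${3:-8}
  mkdir -p "$dest" || return 1
  # -print0 / -0 keep file names with spaces or newlines intact;
  # -I{} makes xargs start one cp per top-level entry.
  find "$src" -mindepth 1 -maxdepth 1 -print0 |
    xargs -0 -P "$jobs" -I{} cp -R {} "$dest"/
}

# Usage, with the paths from the timing data at the end of this mail:
#   parallel_copy wordpress /var/gluster-nfs/dir/wordpress 8
```

This only helps when the tree has several top-level entries of similar size; a single large subfolder is still copied by one `cp`.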

      
      Also I did a few tests against different mount points on NFS and
        GlusterFS to see what the difference was, and NFS kicks the
        glusterfs mount out of the park. Is there a specific reason for
        this?

    
      For FUSE mounts, the replication happens from the client machine
      while for NFS, it happens from the server which was used for
      mounting the volume. This could be the reason since the client is
      farther away while the servers (2 of them at least) are in the
      same region.
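
To make the two mount types concrete, here is a small sketch that prints the corresponding `mount` command lines so they can be reviewed before being run as root. The volume name `webvol` and the mount point `/var/gluster-fuse` in the usage lines are placeholders (the actual volume name is not given in this thread), and the helper name `print_mount_cmd` is made up. The NFS variant assumes Gluster's built-in NFS server, which speaks NFSv3.

```shell
# Hypothetical helper: print (not run) the mount command for a volume.
#   fuse: native client; the client machine writes to every brick itself.
#   nfs : the client talks only to SERVER, which does the replication.
print_mount_cmd() {
  kind=$1 server=$2 volume=$3 mountpoint=$4
  case $kind in
    fuse) echo "mount -t glusterfs $server:/$volume $mountpoint" ;;
    nfs)  echo "mount -t nfs -o vers=3,proto=tcp $server:/$volume $mountpoint" ;;
    *)    echo "unknown mount kind: $kind" >&2; return 1 ;;
  esac
}

# Usage (volume name and FUSE mount point are placeholders):
#   print_mount_cmd fuse gfs-server1 webvol /var/gluster-fuse
#   print_mount_cmd nfs  gfs-server1 webvol /var/gluster-nfs
```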

      Would removing the arbiter volume, or assuming for example's sake
        that there was a third availability zone in ap-southeast-2 so
        latency was not an issue, increase my throughput? As the
        gluster client has to write the data to the 2 gluster
        volumes and the metadata to the arbiter, would this help in
        reducing the time per file?

      
    You could see if locating all 3 servers and the clients in the
      same region helps improve performance.

      
      Regards,

      Ravi

      
        (Also a non-gluster question that no-one has to answer: has
        anyone tried Amazon's Elastic File System (EFS), and is it
        comparable to gluster?)

        
        Thank you for reading the wall of text, and I appreciate all the
        hard work everyone has put into this great product. 

        
        Cheers,

        Tim

        
        [1] Data:

      
      time cp -Rv wordpress/ /var/gluster-nfs/dir/wordpress/
real    165m4.445s
user    0m0.592s
sys     0m3.227s
      du -shc wordpress/
374M    wordpress/

find wordpress/ | wc -l        
4955
(It works out to be on average 2 seconds per file)
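
The per-file figure can be reproduced from the numbers above. This is just the arithmetic written out as a sketch; the function names are made up, and 374M is taken to mean MiB, as `du -h` reports it. Note that the `find` count includes directories as well as files.

```shell
# Average seconds per entry: (minutes * 60 + seconds) / entries.
secs_per_file() {
  awk -v m="$1" -v s="$2" -v n="$3" 'BEGIN { printf "%.2f\n", (m * 60 + s) / n }'
}

# Average throughput in KiB/s for a copy of $1 MiB that took $2 min $3 s.
kib_per_sec() {
  awk -v mib="$1" -v m="$2" -v s="$3" 'BEGIN { printf "%.1f\n", mib * 1024 / (m * 60 + s) }'
}

secs_per_file 165 4.445 4955   # prints 2.00
kib_per_sec 374 165 4.445      # prints 38.7
```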

      
      NFS DD Write: 
      sudo dd if=/dev/zero of=./test bs=1024
412738+0 records in
412738+0 records out
422643712 bytes (423 MB) copied, 85.4381 s, 4.9 MB/s
       
      GlusterFS DD Write (1): 
       
      sudo dd if=/dev/zero of=./testgf bs=1024k count=10000
12+0 records in
12+0 records out
12582912 bytes (13 MB) copied, 117.974 s, 107 kB/s
       
      GlusterFS DD Write: (2):
       
      sudo dd if=/dev/zero of=./testgf1 bs=1024 count=10000
10000+0 records in
10000+0 records out
10240000 bytes (10 MB) copied, 56.8728 s, 180 kB/s
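
The three dd runs above use different block sizes and write different amounts of data, so they are hard to compare directly. One possible way to make them comparable is to run the same fixed-size write against each mount point. This is a sketch, assuming GNU `dd` (for `bs=1M` and `conv=fsync`); the helper name `dd_write_test` and the 64 MiB default are arbitrary choices.

```shell
# Hypothetical helper: write SIZE MiB of zeros into DIR with a fixed block
# size, flush to storage before dd reports its timing, then clean up.
dd_write_test() {
  dir=$1
  size_mib=${2:-64}
  target="$dir/dd-test.$$"
  # conv=fsync makes dd sync the file before it exits, so the reported
  # rate includes the time to push the data out rather than only caching it.
  dd if=/dev/zero of="$target" bs=1M count="$size_mib" conv=fsync
  status=$?
  rm -f "$target"
  return $status
}

# Usage: run the identical test on each mount and compare the rates, e.g.
#   dd_write_test /var/gluster-nfs 64
```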
      

      _______________________________________________
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
http://www.gluster.org/mailman/listinfo/gluster-users
    
    
    -- 
Ravishankar N
work: +91 80 3924 5143
extension: 8373143
mobile: +91 96118 43905
irc nick: itisravi

  