Re: Slow write times to gluster disk

Ravishankar N <ravishankar@xxxxxxxxxx> · Fri, 14 Apr 2017 10:27:22 +0530



    I'm not sure whether the version you are running (glusterfs 3.7.11)
    works with NFS-Ganesha, as the link seems to suggest version >= 3.8
    as a prerequisite. Adding Soumya for help. If it is not supported,
    then you might have to go the plain glusterNFS way.
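A quick way to check the prerequisite mentioned above is to compare the installed version against 3.8. This is only a minimal sketch, not an official Gluster tool: `version_ge` is a hypothetical helper, it assumes GNU `sort -V` is available, and it assumes `glusterfs --version` prints the version as the second word of its first line (as in the "glusterfs 3.7.11 built on ..." banner quoted later in this thread).

```shell
#!/bin/sh
# version_ge A B: succeed if version A >= version B (hypothetical helper).
# Relies on the -V (version sort) option of GNU sort.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Minimum version suggested by the NFS-Ganesha integration guide.
required="3.8"

# Use the installed version if glusterfs is present; otherwise fall back to
# the version reported in this thread.
if command -v glusterfs >/dev/null 2>&1; then
    installed="$(glusterfs --version | awk 'NR == 1 { print $2 }')"
else
    installed="3.7.11"
fi

if version_ge "$installed" "$required"; then
    echo "glusterfs $installed meets the >= $required prerequisite"
else
    echo "glusterfs $installed is older than $required"
fi
```

With the 3.7.11 install described in this thread, the check takes the "older than" branch.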

        Regards,

        Ravi

      
      On 04/14/2017 03:48 AM, Pat Haley wrote:

    
      Hi Ravi (and list),

      
      We are planning on testing the NFS route to see what kind of
      speed-up we get.  A little research led us to the following: 

      
      https://gluster.readthedocs.io/en/latest/Administrator%20Guide/NFS-Ganesha%20GlusterFS%20Integration/

      
      Is this the correct path to take to mount 2 xfs volumes as a single
      gluster file system volume?  If not, what would be a better path?

      
      Pat

      
      On 04/11/2017 12:21 AM, Ravishankar N wrote:

      
        On 04/11/2017 12:42 AM, Pat Haley wrote:

        
          Hi Ravi,

          
          Thanks for the reply.  And yes, we are using the gluster
          native (fuse) mount.  Since this is not my area of expertise I
          have a few questions (mostly clarifications)

          
          Is a factor of 20 slow-down typical when comparing a
          fuse-mounted filesystem with an NFS-mounted filesystem, or
          should we also be looking for additional issues?  (Note that
          the first dd test described below was run on the server that
          hosts the filesystems, so no network communication was
          involved.)

        
        Though both the gluster bricks and the mounts are on the same
        physical machine in your setup, the I/O still passes through
        the different layers of the kernel/user-space fuse stack. That
        said, I don't know if a 20x slow-down on gluster vs. an NFS
        share is normal. Why don't you try doing a gluster NFS mount on
        the machine, run the dd test there, and compare it with the
        gluster fuse mount results?
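The comparison suggested above could be sketched as follows. The mount point `/mnt/gnfs-test`, the volume name `VOLNAME`, and the helper `dd_write_test` are placeholders that do not appear in this thread; `vers=3` is used because gluster NFS serves NFSv3. `conv=fsync` is an addition to the dd commands used in the thread, so that dd flushes the data before it reports a rate.

```shell
#!/bin/sh
# Mount the volume through gluster NFS next to the existing fuse mount
# (run as root; VOLNAME and /mnt/gnfs-test are placeholders):
#   mkdir -p /mnt/gnfs-test
#   mount -t nfs -o vers=3 localhost:/VOLNAME /mnt/gnfs-test

# dd_write_test DIR [COUNT]: write COUNT 1 MiB blocks of zeros into DIR,
# print dd's summary line, and remove the test file again.
dd_write_test() {
    dir="$1"
    count="${2:-1000}"
    file="$dir/dd-write-test.$$"
    # conv=fsync: flush file data to storage before dd reports its timing.
    dd if=/dev/zero of="$file" bs=1M count="$count" conv=fsync 2>&1 | tail -n 1
    rm -f "$file"
}

# Compare the fuse mount with the NFS mount, skipping any that are absent.
for target in /gdata /mnt/gnfs-test; do
    if [ -d "$target" ] && [ -w "$target" ]; then
        printf '%s: ' "$target"
        dd_write_test "$target"
    fi
done
```

Running the same helper against both mount points keeps the block size and file size identical, so any difference in the reported rate comes from the access path rather than from the dd invocation.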

         
          You also mention tweaking "write-behind xlator settings".
          Would you expect better speed improvements from switching the
          mounting from fuse to gnfs or from tweaking the settings?
          Also, are these mutually exclusive, or would there be
          additional benefits from both switching to gnfs and tweaking?

        
        You should test these out and find the answers yourself. :-)

        
          My next question is to make sure I'm clear on the comment "if
          the gluster node containing the gnfs server goes down, all
          mounts done using that node will fail".  If you have 2
          servers, each hosting 1 brick of the overall gluster FS, and
          one server fails, then for gnfs nothing on either server is
          visible to other nodes, while under fuse only the files on the
          dead server are not visible.  Is this what you meant?

        
        Yes, for gnfs mounts, all I/O from the various mounts goes to
        the gnfs server process (on the machine whose IP was used at the
        time of mounting), which then sends the I/O to the brick
        processes. For fuse, the gluster fuse mount itself talks
        directly to the bricks.

         
          Finally, you mention "even for gnfs mounts, you can achieve
          fail-over by using CTDB".  Do you know if CTDB would have any
          performance impact (i.e. in a worst-case scenario, could adding
          CTDB to gnfs erase the speed benefits of going to gnfs in the
          first place)?

        
        I don't think it would. You can even achieve load balancing via
        CTDB to use different gnfs servers for different clients. But I
        don't know if this is needed/helpful in your current setup,
        where everything (bricks and clients) seems to be on just one
        server.

        
        -Ravi

         Thanks

          
          Pat

          
          On 04/08/2017 12:58 AM, Ravishankar N wrote:

          
            Hi Pat,

              
              I'm assuming you are using the gluster native (fuse) mount.
              If it helps, you could try mounting it via gluster NFS
              (gnfs) and then see if there is an improvement in speed.
              Fuse mounts are slower than gnfs mounts, but you get the
              benefit of avoiding a single point of failure. (Unlike
              fuse mounts, if the gluster node containing the gnfs
              server goes down, all mounts done using that node will
              fail.) For fuse mounts, you could try tweaking the
              write-behind xlator settings to see if it helps. See the
              performance.write-behind and
              performance.write-behind-window-size options in `gluster
              volume set help`. Of course, even for gnfs mounts, you can
              achieve fail-over by using CTDB.
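As an illustration of the write-behind tuning mentioned above, the sketch below prints (and, with `DRY_RUN=0`, runs) the relevant commands. The two option names come from the message; the volume name `VOLNAME`, the `run` helper, and the 4MB window size are assumptions chosen for the example, not recommended values.

```shell
#!/bin/sh
# Placeholder volume name; substitute the name shown by `gluster volume list`.
VOLNAME="${VOLNAME:-VOLNAME}"

# run CMD...: print the command by default; execute it only when DRY_RUN=0.
run() {
    if [ "${DRY_RUN:-1}" = "0" ]; then
        "$@"
    else
        echo "$*"
    fi
}

# Read the descriptions of the available options first.
run gluster volume set help

# Make sure write-behind is enabled, then try a larger window.
# 4MB is an arbitrary example value, not a recommendation.
run gluster volume set "$VOLNAME" performance.write-behind on
run gluster volume set "$VOLNAME" performance.write-behind-window-size 4MB

# Options changed from their defaults are listed under "Options Reconfigured".
run gluster volume info "$VOLNAME"
```

Re-running the same dd test after each change, one option at a time, makes it easier to attribute any difference to a specific setting.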

              
              Thanks,

              Ravi

              
              On 04/08/2017 12:07 AM, Pat Haley wrote:

            
              Hi,

              
              We noticed a dramatic slowness when writing to a gluster
              disk when compared to writing to an NFS disk. Specifically
              when using dd (data duplicator) to write a 4.3 GB file of
              zeros:

              
                on NFS disk (/home): 9.5 Gb/s
                on gluster disk (/gdata): 508 Mb/s

                
              The gluster disk is 2 bricks joined together, no
              replication or anything else. The hardware is (literally)
              the same:

              
                one server with 70 hard disks and a hardware RAID card
                4 disks in a RAID-6 group (the NFS disk)
                32 disks in a RAID-6 group (the max allowed by the card, /mnt/brick1)
                32 disks in another RAID-6 group (/mnt/brick2)
                2 hot spares

                
              Some additional information and more test results
              (after changing the log level):

              
              glusterfs 3.7.11 built on Apr 27 2016 14:09:22

                CentOS release 6.8 (Final)

                RAID bus controller: LSI Logic / Symbios Logic MegaRAID
                SAS-3 3108 [Invader] (rev 02)

                
                Create the file on /gdata (gluster)

                [root@mseas-data2 gdata]# dd if=/dev/zero of=/gdata/zero1 bs=1M count=1000
                1000+0 records in
                1000+0 records out
                1048576000 bytes (1.0 GB) copied, 1.91876 s, 546 MB/s

                
                Create the file on /home (ext4)

                [root@mseas-data2 gdata]# dd if=/dev/zero of=/home/zero1 bs=1M count=1000
                1000+0 records in
                1000+0 records out
                1048576000 bytes (1.0 GB) copied, 0.686021 s, 1.5 GB/s - 3 times as fast

                  
                Copy from /gdata to /gdata (gluster to gluster)

                [root@mseas-data2 gdata]# dd if=/gdata/zero1 of=/gdata/zero2
                2048000+0 records in
                2048000+0 records out
                1048576000 bytes (1.0 GB) copied, 101.052 s, 10.4 MB/s - realllyyy slooowww

                
                Copy from /gdata to /gdata 2nd time (gluster to gluster)

                [root@mseas-data2 gdata]# dd if=/gdata/zero1 of=/gdata/zero2
                2048000+0 records in
                2048000+0 records out
                1048576000 bytes (1.0 GB) copied, 92.4904 s, 11.3 MB/s - realllyyy slooowww again

                
                Copy from /home to /home (ext4 to ext4)

                [root@mseas-data2 gdata]# dd if=/home/zero1 of=/home/zero2
                2048000+0 records in
                2048000+0 records out
                1048576000 bytes (1.0 GB) copied, 3.53263 s, 297 MB/s - 30 times as fast

                
                Copy from /home to /home (ext4 to ext4)

                [root@mseas-data2 gdata]# dd if=/home/zero1 of=/home/zero3
                2048000+0 records in
                2048000+0 records out
                1048576000 bytes (1.0 GB) copied, 4.1737 s, 251 MB/s - 30 times as fast
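One detail in the record counts above is worth spelling out. The copy tests were run without `bs=`, so dd used its default block size of 512 bytes, which is why they report 2048000 records rather than 1000. They are therefore not directly comparable to the `bs=1M` write tests, and the small block size may account for part of the gap on the fuse mount. The arithmetic, plus a copy command with an explicit block size (an addition, not something run in this thread), can be sketched as:

```shell
#!/bin/sh
bytes=1048576000          # size of zero1 as reported by dd above
default_bs=512            # dd's block size when bs= is not given
big_bs=$((1024 * 1024))   # block size selected by bs=1M

echo "records at 512 B: $((bytes / default_bs))"
echo "records at 1 MiB: $((bytes / big_bs))"

# Rate of the first gluster-to-gluster copy in MB/s (dd uses 1 MB = 10^6 bytes).
awk -v b="$bytes" -v s=101.052 'BEGIN { printf "rate: %.1f MB/s\n", b / s / 1e6 }'

# Re-running the copy with the same block size as the write tests
# (left commented out because it needs the /gdata mount):
#   dd if=/gdata/zero1 of=/gdata/zero2 bs=1M
```

The first two lines print 2048000 and 1000, matching the "records in/out" counts in the transcripts, and the computed rate rounds to the 10.4 MB/s that dd reported.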

                  
                  As a test, can we copy data directly to the xfs
                  mountpoint (/mnt/brick1) and bypass gluster?

                  
                  Any help you could give us would be appreciated.

                  
                Thanks

              
              -- 

-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
Pat Haley                          Email:  phaley@xxxxxxx
Center for Ocean Engineering       Phone:  (617) 253-6824
Dept. of Mechanical Engineering    Fax:    (617) 253-8125
MIT, Room 5-213                    http://web.mit.edu/phaley/www/
77 Massachusetts Avenue
Cambridge, MA  02139-4301

              
              _______________________________________________
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
http://lists.gluster.org/mailman/listinfo/gluster-users
            
            

        

    