Re: Unnecessary healing in 3-node replication setup on reboot

On 16.10.2015 at 18:51, Vijay Bellur wrote:

self-healing in gluster by default syncs only modified parts of the files from a source node. Gluster does a rolling checksum of a file needing self-heal to identify regions of the file which need to be synced over the network. This rolling checksum computation can sometimes be expensive and there are plans to have a lighter self-healing in 3.8 with more granular changelogs that can do away with the need to do a rolling checksum.

I ran some tests (see below) - could you please take a look and tell me whether this is normal?


For example, I have a 200GB VM disk image in my volume (the biggest file). About 75% of that disk is currently unused space, and writes to it amount to only about 50 kB/s.
Yet that 200GB disk image always takes a very long time to "heal" (at least 30 minutes), even though I'm fairly sure only a few blocks can have changed.
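If the rolling checksum is what eats most of that time, one thing I could try is pinning the self-heal algorithm explicitly and comparing. A rough sketch, assuming the cluster.data-self-heal-algorithm volume option (values "diff" and "full") behaves the same way in 3.5.2:

 # force "diff": only changed blocks are checksummed and synced
 gluster volume set systems cluster.data-self-heal-algorithm diff

 # or force "full" for comparison, to see whether the checksumming or the
 # raw data transfer dominates the 30+ minutes on the 200GB image
 gluster volume set systems cluster.data-self-heal-algorithm full

 # back to the built-in default afterwards
 gluster volume reset systems cluster.data-self-heal-algorithm

Would that be a sensible experiment, or is it risky on images of this size?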


Anyway, I just rebooted one node (about 2-3 minutes of downtime) to collect some information:
  • in total there are about 790 GB* of files in that Gluster volume
  • about 411 GB* belong to active VM HDD images; the rest are backup/template files
  • only VM HDD images are being healed (at most 15 files)
  • while healing, glusterfsd shows varying CPU usage between 70% and 650% (it's a 16-core server); 106 minutes of CPU time in total once healing completed
  • by the time healing completed, the machine had received a total of 7.0 GB and sent 3.6 GB over the internal network (so, yes, you're right that not all contents are transferred)
  • total heal time: a whopping 58 minutes
* these are summed-up file sizes; "du" and "df" show smaller usage (see the check below)
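I assume the difference is because the VM images are thin-provisioned/sparse; a quick way to see the gap between apparent and allocated size (the path below is just a placeholder for one of my images):

 # apparent size (what I summed up above) vs. blocks actually allocated on disk
 du -h --apparent-size /data/gluster/systems/images/<vmid>/<disk-image>
 du -h /data/gluster/systems/images/<vmid>/<disk-image>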

Node details (all 3 nodes are identical):
  • DELL PowerEdge R730
  • Intel Xeon E5-2600 @ 2.4GHz
  • 64 GB DDR4 RAM
  • the server can gzip-compress about 1 GB of data per second (all cores combined)
  • 3 TB HW-RAID10 HDD (2.7 TB reserved for Gluster); at least 500 MB/s write and 350 MB/s read speed (see the dd sketch after this list)
  • redundant 1 Gbit/s internal network
  • Debian 7 Wheezy / Proxmox 3.4, kernel 2.6.32, Gluster 3.5.2
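(For reference, throughput numbers like the ones above can be reproduced with a plain dd against the brick filesystem, bypassing the page cache; the test file is just a throwaway path:)

 # sequential write
 dd if=/dev/zero of=/data/gluster/ddtest bs=1M count=4096 oflag=direct
 # sequential read of the same file
 dd if=/data/gluster/ddtest of=/dev/null bs=1M iflag=direct
 rm /data/gluster/ddtest
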
Volume setup:
 # gluster volume info systems

Volume Name: systems
Type: Replicate
Volume ID: b2d72784-4b0e-4f7b-b858-4ec59979a064
Status: Started
Number of Bricks: 1 x 3 = 3
Transport-type: tcp
Bricks:
Brick1: metal1:/data/gluster/systems
Brick2: metal2:/data/gluster/systems
Brick3: metal3:/data/gluster/systems
Options Reconfigured:
cluster.server-quorum-ratio: 51%
Note that `gluster volume heal "systems" info` takes 3-10 seconds to complete while a heal is running - I tend to run that command frequently, so I hope that doesn't slow the healing down.
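For what it's worth, I poll it with something like the loop below; if there's a lighter-weight way to just get a count of pending heals, I'd happily switch. I believe newer releases have a heal-count subcommand, but I'm not sure it exists in 3.5.x:

 # full listing of entries still needing heal (what I run today; the interval is arbitrary)
 watch -n 30 'gluster volume heal systems info'

 # possibly cheaper, if this subcommand is available in the installed release:
 gluster volume heal systems statistics heal-count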


Would you expect these results or is something wrong?

Would upgrading to Gluster 3.6 or 3.7 improve healing performance?

Thanks,
Udo

_______________________________________________
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
http://www.gluster.org/mailman/listinfo/gluster-users
