Re: gluster 3.4 self-heal

Hi Ravishankar,

thank you for the explanation.
I expected a performance hit after such a long shutdown; the only problem is that I couldn't tell
whether the healing was progressing or not.
After launching gluster volume heal vol1 full I can see the number of files in the .glusterfs/indices/xattrop/
directory decreasing, but at this rate it would take two weeks to finish; maybe I would rather delete the volume
and recreate it from scratch with 3.5.
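
For reference, this is roughly how I am watching the progress (the brick path /data/brick1 below is just a placeholder; substitute the actual brick directory):

    # count pending-heal entries on the brick; re-run periodically
    # and compare the counts to estimate the heal rate
    ls /data/brick1/.glusterfs/indices/xattrop/ | wc -l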

Thanks
Ivano

On 5/27/14 7:35 PM, Ravishankar N wrote:
On 05/27/2014 08:47 PM, Ivano Talamo wrote:
Dear all,

we have a replicated volume (2 servers with 1 brick each) on Scientific Linux 6.2 with gluster 3.4.
Everything was running fine until we shut down one of the two and kept it down for 2 months.
When it came up again the volume could not be healed, and we have the following symptoms
(call #1 the always-up server, #2 the server that was kept down):

-doing I/O on the volume gives very bad performance (impossible to keep VM images on it)

A replica's bricks are not supposed to be intentionally kept down even for hours, let alone months :-( . If you do, then when the brick does come back up there will be tons of stuff to heal, so a performance hit is expected.
-on #1 there are 3997354 files in .glusterfs/indices/xattrop/ and the number doesn't go down

When #2 was down, did the I/O involve directory renames? (See if there are entries in .glusterfs/landfill on #2.) If yes, then this is a known issue and a fix is in progress: http://review.gluster.org/#/c/7879/
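
Something along these lines on #2 would show it (the brick path /data/brick1 is only an example; use your real brick directory):

    # entries here are directories scheduled for background deletion;
    # a non-empty listing points to the known rename issue above
    ls -l /data/brick1/.glusterfs/landfill/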

-on #1 the first run of gluster volume heal vol1 info takes a long time to finish and doesn't show anything;
after that it prints "Another transaction is in progress. Please try again after sometime."
This is fixed in glusterfs 3.5, where heal info is much more responsive.

Furthermore, on #1 glustershd.log is full of messages like this:
[2014-05-27 15:07:44.145326] W [client-rpc-fops.c:1538:client3_3_inodelk_cbk] 0-vol1-client-0: remote operation failed: No such file or directory
[2014-05-27 15:07:44.145880] W [client-rpc-fops.c:1640:client3_3_entrylk_cbk] 0-vol1-client-0: remote operation failed: No such file or directory
[2014-05-27 15:07:44.146070] E [afr-self-heal-entry.c:2296:afr_sh_post_nonblocking_entry_cbk] 0-vol1-replicate-0: Non Blocking entrylks failed for <gfid:bfbe65db-7426-4ca0-bf0b-7d1a28de2052>.
[2014-05-27 15:13:34.772856] E [afr-self-heal-data.c:1270:afr_sh_data_open_cbk] 0-vol1-replicate-0: open of <gfid:18a358e0-23d3-4f56-8d74-f5cc38a0d0ea> failed on child vol1-client-0 (No such file or directory)

On #2's bricks I see some updates, i.e. new filenames appearing, and .glusterfs/indices/xattrop/ is usually empty.

Do you know what's happening? How can we fix this?
You could try a `gluster volume heal vol1 full` to see if the bricks get synced.
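
To watch whether it is making progress, something like this (the `info healed` sub-command should be available on 3.4 as well, if I remember correctly):

    # list entries the self-heal daemon has healed so far
    gluster volume heal vol1 info healed
    # and entries it failed to heal
    gluster volume heal vol1 info heal-failed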

Regards,
Ravi

thank you,
Ivano






_______________________________________________
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
http://supercolony.gluster.org/mailman/listinfo/gluster-users
