On Thu, Mar 3, 2016 at 4:10 PM, Ravishankar N <ravishankar@xxxxxxxxxx> wrote:
Hi,
On 03/03/2016 11:14 AM, ABHISHEK PALIWAL wrote:
Hi Ravi,
As discussed earlier, I investigated this issue and found that healing is not triggered because the "gluster volume heal c_glusterfs info split-brain" command shows no entries in its output, even though the file is in split-brain.
Couple of observations from the 'commands_output' file.
getfattr -d -m . -e hex opt/lvmdir/c2/brick/logfiles/availability/CELLO_AVAILABILITY2_LOG.xml
The afr xattrs do not indicate that the file is in split brain:
# file: opt/lvmdir/c2/brick/logfiles/availability/CELLO_AVAILABILITY2_LOG.xml
trusted.afr.c_glusterfs-client-1=0x000000000000000000000000
trusted.afr.dirty=0x000000000000000000000000
trusted.bit-rot.version=0x000000000000000b56d6dd1d000ec7a9
trusted.gfid=0x9f5e354ecfda40149ddce7d5ffe760ae
getfattr -d -m . -e hex opt/lvmdir/c2/brick/logfiles/availability/CELLO_AVAILABILITY2_LOG.xml
trusted.afr.c_glusterfs-client-0=0x000000080000000000000000
trusted.afr.c_glusterfs-client-2=0x000000020000000000000000
trusted.afr.c_glusterfs-client-4=0x000000020000000000000000
trusted.afr.c_glusterfs-client-6=0x000000020000000000000000
trusted.afr.dirty=0x000000000000000000000000
trusted.bit-rot.version=0x000000000000000b56d6dcb7000c87e7
trusted.gfid=0x9f5e354ecfda40149ddce7d5ffe760ae
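For clarity, here is a minimal Python sketch (my own illustration, not part of the original exchange) of how the trusted.afr values above decode, assuming the standard AFR layout of three big-endian 32-bit pending counters for data, metadata and entry operations:

```python
import struct

def decode_afr(hex_value: str):
    """Decode a trusted.afr.* xattr value into its three big-endian
    32-bit pending counters: (data, metadata, entry)."""
    raw = bytes.fromhex(hex_value[2:] if hex_value.startswith("0x") else hex_value)
    return struct.unpack(">III", raw[:12])

# Values taken from the getfattr output above.
clean = decode_afr("0x000000000000000000000000")   # first brick's view
blame = decode_afr("0x000000080000000000000000")   # second brick blames client-0

print(clean)  # (0, 0, 0) -> nothing pending
print(blame)  # (8, 0, 0) -> 8 pending data operations, no metadata/entry
```

A split-brain would require both bricks to blame each other; here only one side shows non-zero pending data counts, which is consistent with Ravi's observation that this is a pending heal rather than a split-brain.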
1. There doesn't seem to be a split-brain going by the trusted.afr* xattrs.
If it is not a split-brain problem, then how can I resolve this?
2. You seem to have re-used the bricks from another volume/setup. For replica 2, only trusted.afr.c_glusterfs-client-0 and trusted.afr.c_glusterfs-client-1 should be present, but I see four xattrs: client-0, 2, 4 and 6.
Could you please suggest why these entries are there? I am not able to figure out the scenario. I am rebooting one board multiple times to reproduce the issue, and after every reboot I do a remove-brick and add-brick on the same volume for the second board.
3. On the rebooted node, do you have ssl enabled by any chance? There is a bug for "Not able to fetch volfile' when ssl is enabled: https://bugzilla.redhat.com/show_bug.cgi?id=1258931
Btw, for data and metadata split-brains you can use the gluster CLI https://github.com/gluster/glusterfs-specs/blob/master/done/Features/heal-info-and-split-brain-resolution.md instead of modifying the file from the back end.
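For reference, the CLI-based resolution described in that document looks roughly like the following. This is a sketch using the volume and file names from this thread; <hostname> is a placeholder, and the file path is given relative to the volume root:

```shell
# List files currently in split-brain (the command already run in this thread):
gluster volume heal c_glusterfs info split-brain

# Resolve a data/metadata split-brain by keeping the larger copy:
gluster volume heal c_glusterfs split-brain bigger-file \
    /logfiles/availability/CELLO_AVAILABILITY2_LOG.xml

# Or keep the copy on a specific brick as the heal source:
gluster volume heal c_glusterfs split-brain source-brick \
    <hostname>:/opt/lvmdir/c2/brick \
    /logfiles/availability/CELLO_AVAILABILITY2_LOG.xml
```

Since the log file in this thread is fixed at 2MB on both bricks, bigger-file would not help; source-brick is the relevant variant if a split-brain were actually reported.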
But you are saying it is not a split-brain problem, and even the split-brain command is not showing any file, so how can I find the file that is bigger in size? Also, in my case the file size is fixed at 2MB; it is overwritten every time.
-Ravi
Abhishek
But my question is: why is the split-brain command not showing any file in its output? What I have done is manually delete the gfid entry of that file from the .glusterfs directory and follow the instructions mentioned in the following link to do the heal, and this works fine for me.
https://github.com/gluster/glusterfs/blob/master/doc/debugging/split-brain.md
Here I am attaching all the logs I got from the node, and also the output of the commands from both boards.
In this tar file two directories are present:
000300 - logs for the board which is running continuously
002500 - logs for the board which was rebooted
I am waiting for your reply; please help me out with this issue.
Thanks in advance.
Regards,
On Fri, Feb 26, 2016 at 1:21 PM, ABHISHEK PALIWAL <abhishpaliwal@xxxxxxxxx> wrote:
On Fri, Feb 26, 2016 at 10:28 AM, Ravishankar N <ravishankar@xxxxxxxxxx> wrote:
On 02/26/2016 10:10 AM, ABHISHEK PALIWAL wrote:
Yes correct
Okay, so when you say the files are not in sync until some time, are you getting stale data when accessing from the mount?
I'm not able to figure out why heal info shows zero when the files are not in sync, despite all IO happening from the mounts. Could you provide the output of getfattr -d -m . -e hex /brick/file-name from both bricks when you hit this issue?
I'll provide the logs once I get them. Here, the delay means we are powering on the second board after 10 minutes.
On Feb 26, 2016 9:57 AM, "Ravishankar N" <ravishankar@xxxxxxxxxx> wrote:
Hello,
On 02/26/2016 08:29 AM, ABHISHEK PALIWAL wrote:
Hi Ravi,
Thanks for the response. We are using GlusterFS 3.7.8.
Here is the use case:
We have a logging file which saves event logs for every board of a node, and these files are kept in sync using glusterfs. The system is in replica 2 mode, which means that when one brick in a replicated volume goes offline, the glusterd daemons on the other nodes keep track of all the files that are not replicated to the offline brick. When the offline brick becomes available again, the cluster initiates a healing process, replicating the updated files to that brick. But in our case, we see that the log file of one board is not in sync and its format is corrupted.
Just to understand you correctly, you have mounted the 2 node replica-2 volume on both these nodes and writing to a logging file from the mounts right?
Regards,
Abhishek

Solution: when we tried to put a delay of more than 5 minutes before the healing, everything works fine. Even the outcome of "gluster volume heal c_glusterfs info" shows that there are no pending heals.
Also, the logging file which is updated is of fixed size, and new entries wrap around, overwriting the old entries.
This way we have seen that after a few restarts the contents of the same file on the two bricks are different, but volume heal info shows zero entries.
On Fri, Feb 26, 2016 at 6:35 AM, Ravishankar N <ravishankar@xxxxxxxxxx> wrote:
On 02/25/2016 06:01 PM, ABHISHEK PALIWAL wrote:
Hi,
I have one query regarding the time taken by the healing process. In our current two-node setup, when we rebooted one node, the self-healing process started within a 5-minute interval on the board, which resulted in the corruption of some files' data.
Heal should start immediately after the brick process comes up. What version of gluster are you using? What do you mean by corruption of data? Also, how did you observe that the heal started after 5 minutes?
-Ravi
And to resolve it I have search on google and found the following link:
https://support.rackspace.com/how-to/glusterfs-troubleshooting/
It mentions that the healing process can take up to 10 minutes to start.
Here is the statement from the link:
"Healing replicated volumes
When any brick in a replicated volume goes offline, the glusterd daemons on the remaining nodes keep track of all the files that are not replicated to the offline brick. When the offline brick becomes available again, the cluster initiates a healing process, replicating the updated files to that brick. The start of this process can take up to 10 minutes, based on observation."
After allowing more than 5 minutes, the file corruption problem was resolved.
So, my question is: is there any way to reduce the time the healing process takes to start?
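One setting that may be relevant here (an assumption on my part, not something confirmed in this thread): the self-heal daemon's periodic crawl interval is governed by the cluster.heal-timeout volume option, which defaults to 600 seconds (10 minutes) and matches the delay described above. A sketch:

```shell
# Check the current value (default is 600 seconds):
gluster volume get c_glusterfs cluster.heal-timeout

# Lower the self-heal daemon's crawl interval to 60 seconds:
gluster volume set c_glusterfs cluster.heal-timeout 60

# Or trigger an index heal immediately instead of waiting for the next crawl:
gluster volume heal c_glusterfs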
Regards,
Abhishek Paliwal
_______________________________________________ Gluster-devel mailing list Gluster-devel@xxxxxxxxxxx http://www.gluster.org/mailman/listinfo/gluster-devel
--
Regards
Abhishek Paliwal
_______________________________________________ Gluster-users mailing list Gluster-users@xxxxxxxxxxx http://www.gluster.org/mailman/listinfo/gluster-users