Re: Self Heal Confusion

Brett Holcomb <biholcomb@xxxxxxxxxx> · Sat, 22 Dec 2018 23:17:32 -0500



    Very strange.  I see this in the glusterd.log
    [2018-12-22 23:53:47.216743] E [MSGID: 101191]
      [event-epoll.c:671:event_dispatch_epoll_worker] 0-epoll: Failed to
      dispatch handler

    After force starting the volume and running

    gluster vol heal projects full

    I see this in the glustershd log, so I assume it started:

    
    [2018-12-22 22:54:22.328897] I [MSGID: 114046]
      [client-handshake.c:1107:client_setvolume_cbk]
      0-projects-client-5: Connected to projects-client-5, attached to
      remote volume '/srv/gfs01/Projects'.
    This shows up in the glfsheal-projects.log file.
    [2018-12-22 23:53:41.916773] E [MSGID: 101191]
      [event-epoll.c:671:event_dispatch_epoll_worker] 0-epoll: Failed to
      dispatch handler
    I'm not sure what it's trying to tell me when it fails to
      dispatch a handler.
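
To get a feel for how often each error shows up, the log lines can be tallied by message ID. This is only a rough sketch: it assumes the line layout seen in the excerpts above (`[timestamp] E [MSGID: NNNNNN] ...`), and the function name `count_error_msgids` is made up for illustration. It says nothing about what the error means, only how frequently it is logged.

```shell
#!/bin/sh
# Tally error-level ("E") lines per MSGID from gluster-style log text on stdin.
# Assumes the layout shown in the excerpts: [timestamp] E [MSGID: NNNNNN] ...
# Wrapped continuation lines carry no " E [MSGID: ...]" marker and are skipped.
count_error_msgids() {
    awk '
        / E \[MSGID: [0-9]+\]/ {
            # Pull out the digits that follow "MSGID: " (7 characters).
            match($0, /MSGID: [0-9]+/)
            id = substr($0, RSTART + 7, RLENGTH - 7)
            count[id]++
        }
        END {
            for (id in count) print id, count[id]
        }
    ' | sort
}

# Example (the log path is an assumption; adjust to where your logs live):
#   count_error_msgids < /var/log/glusterfs/glfsheal-projects.log
```

Each output line is a message ID followed by the number of error lines carrying it, so a count that keeps growing between runs suggests the error is still being logged.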
    From what I could find, early 5.0 builds had issues that produced
      some of these errors, but a patch was included early on.  I am
      on 5.2.
    I'll keep digging.

    
    On 12/20/18 8:26 PM, John Strunk wrote:

    
      Assuming your bricks are up... yes, the heal count
        should be decreasing.
        

        There is/was a bug wherein self-heal would stop healing but
          would still be running. I don't know whether your version is
          affected, but the remedy is to just restart the self-heal
          daemon.
        Force start one of the volumes that has heals pending. The
          bricks are already running, but it will cause shd to restart
          and, assuming this is the problem, healing should begin...
        

        $ gluster vol start my-pending-heal-vol force
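
One way to check whether healing actually resumes after the force start is to sample the pending-entry count twice and compare. The sketch below is an assumption-laden illustration, not an official procedure: it assumes the heal-count output prints one `Number of entries: N` line per brick, and the helper names `sum_heal_entries` and `heal_progress` are invented here.

```shell
#!/bin/sh
# Sum every "Number of entries: N" line read from stdin.
# Assumption: the heal-count output prints one such line per brick.
# Non-numeric values (for example "-") are counted as 0 by awk.
sum_heal_entries() {
    awk -F': *' '/^Number of entries:/ { total += $2 } END { print total + 0 }'
}

# Report whether the pending count dropped between two samples.
# $1 = earlier count, $2 = later count
heal_progress() {
    if [ "$2" -lt "$1" ]; then
        echo "healing: $1 -> $2"
    else
        echo "stalled: $1 -> $2"
    fi
}

# Example usage (volume name is the placeholder from the command above):
#   before=$(gluster volume heal my-pending-heal-vol statistics heal-count | sum_heal_entries)
#   sleep 600
#   after=$(gluster volume heal my-pending-heal-vol statistics heal-count | sum_heal_entries)
#   heal_progress "$before" "$after"
```

A "stalled" result over a reasonably long interval would match the symptom described above, where the daemon is running but the count never moves.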
        

        Others could better comment on the status of the bug.
        

        -John
        

        On Thu, Dec 20, 2018 at 5:45 PM Brett Holcomb
          <biholcomb@xxxxxxxxxx> wrote:

        
          I have one volume that has 85 pending entries in healing and
          two more volumes with 58,854 entries in healing pending.
          These numbers are from the volume heal info summary command.
          They have stayed constant for two days now.  I've read the
          Gluster docs and many more.  The Gluster docs just give some
          commands, and non-Gluster docs basically repeat that.  Given
          that it appears no self-healing is going on for my volume, I
          am confused as to why.

          
          1.  If a self-heal daemon is listed on a host (all of mine
          show one with a volume status command), can I assume it's
          enabled and running?

          
          2.  I assume the volume that has all the self-heals pending
          has some serious issues even though I can access the files
          and directories on it.  If self-heal is running, shouldn't
          the numbers be decreasing?

          
          It appears to me self-heal is not working properly, so how do
          I get it to start working, or should I delete the volume and
          start over?

          
          I'm running Gluster 5.2 on CentOS 7, latest and updated.

          
          Thank you.

          
          _______________________________________________
          Gluster-users mailing list
          Gluster-users@xxxxxxxxxxx
          https://lists.gluster.org/mailman/listinfo/gluster-users
      
    