Re: split-brain recovery automation, any plans?

Darrell Budic <budic@xxxxxxxxxxxxxxxx> · Tue, 12 Jul 2016 15:20:00 -0500

FYI, it’s my experience that “yum upgrade” will stop the running glusterd (and possibly the running glusterfsds) during its installation of new gluster components. I’ve also noticed it starts them back up again during the process.
For example, yesterday I upgraded a system to 3.7.13:

systemctl stop glusterd
<manually kill all glusterfsds>
yum upgrade

and discovered that glusterd was running again, having been started during the yum upgrade. All its glusterfsds had also started. Somewhat annoying, actually, because I had been planning to reboot the server to switch to the latest kernel as part of the process, but really didn’t feel like interrupting the heals at that time.
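
One way to check for this right after the package transaction is to count the gluster daemons in the process list. This is only a minimal sketch: the helper name `count_gluster_daemons` is made up for illustration, and it assumes the daemons appear under the command names glusterd and glusterfsd in `ps -eo comm=` output.

```shell
# Hypothetical helper: count gluster daemons in a process listing read from
# stdin, one command name per line (the format `ps -eo comm=` prints).
# Assumption: the daemons show up as exactly "glusterd" or "glusterfsd".
count_gluster_daemons() {
    # grep -c prints 0 and exits non-zero when nothing matches;
    # "|| true" keeps that case from looking like a failure.
    grep -c -E '^(glusterd|glusterfsd)$' || true
}
```

Running `ps -eo comm= | count_gluster_daemons` after `yum upgrade` should print 0 if the daemons stayed down; anything else means the package scripts brought them back up.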

This probably didn’t have much impact on you, but it would have restarted, twice over, any healing that resulted from the upgrade downtime. You may have caused yourself some extra wait for the first round of healing to conclude if stuff was using those volumes at the time. If you didn’t wait for those heals to conclude before starting your next upgrade, you could have caused a split brain on affected active files.
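
Waiting for heals before moving to the next server can be scripted roughly as below. Treat it as a sketch under assumptions, not a tested procedure: it relies on `gluster volume heal <VOLNAME> info` printing one "Number of entries: N" line per brick, the function names and the `POLL_SECONDS` variable are invented for illustration, and a non-numeric count (which I believe an unreachable brick can report) is deliberately counted as still pending.

```shell
# Hypothetical helper: sum the per-brick "Number of entries: N" lines from
# `gluster volume heal <VOLNAME> info` output supplied on stdin.
pending_heal_entries() {
    awk -F': *' '
        /^Number of entries:/ {
            gsub(/[ \t\r]/, "", $2)
            if ($2 ~ /^[0-9]+$/) total += $2
            else total += 1   # non-numeric count: assume the brick is not healthy
        }
        END { print total + 0 }'
}

# Hypothetical helper: block until the given volume reports no pending heals.
# POLL_SECONDS is an illustrative knob, not a gluster setting.
wait_for_heals() {
    vol=$1
    while :; do
        # Bail out rather than report "0 pending" if the CLI itself fails.
        info=$(gluster volume heal "$vol" info) || return 1
        pending=$(printf '%s\n' "$info" | pending_heal_entries)
        [ "$pending" -eq 0 ] && return 0
        echo "$vol: $pending entries still pending heal"
        sleep "${POLL_SECONDS:-30}"
    done
}
```

On each server the order would then be: upgrade, restart the daemons, run `wait_for_heals <VOLNAME>` for every volume, and only then start on the next server.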

  -Darrell

On Jul 12, 2016, at 10:57 AM, Dmitry Melekhov <dm@xxxxxxxxxx> wrote:

    12.07.2016 17:38, Pranith Kumar Karampuri wrote:

      Did you wait for heals to complete before
        upgrading the second node?

    no...

        On Tue, Jul 12, 2016 at 3:08 PM, Dmitry
          Melekhov <dm@xxxxxxxxxx>
          wrote:

              12.07.2016 13:31, Pranith Kumar Karampuri wrote:

                      On Mon, Jul 11, 2016 at
                        2:26 PM, Dmitry Melekhov <dm@xxxxxxxxxx>
                        wrote:

                        11.07.2016 12:47,
                          Gandalf Corvotempesta wrote:

                             2016-07-11
                              9:54 GMT+02:00 Dmitry Melekhov <dm@xxxxxxxxxx>:

                               We just
                                got split-brain during update to 3.7.13
                                ;-)

                              This is an interesting point.

                              Could you please tell me which replica
                              count you set?

                           3

                              With replica "3", split brain should not
                              occur, right?

                           I guess we did something wrong :-)

                        Or there is a bug we never found? Could you
                          please share details about what you did?

               Upgraded to 3.7.13 from 3.7.11 using yum, while at
              least one VM was running :-)

              on all 3 servers, one by one:

              yum upgrade

              systemctl stop glusterd 

              then killed glusterfsd processes using kill

              and systemctl start glusterd

              then next server....

              After this we tried to restart the VM, but it failed, because
              we forgot to restart libvirtd, and it used the old libraries.

              I guess this is the point where we got this problem.

                               I'm
                                planning a new cluster and I would like
                                to be protected against

                                split brains.

_______________________________________________

                              Gluster-users mailing list

                              Gluster-users@xxxxxxxxxxx

                              http://www.gluster.org/mailman/listinfo/gluster-users

                      -- 

                        Pranith

        -- 

          Pranith

