Hari-
I was upgrading my test cluster from 5.5 to 6 and I hit this bug (https://bugzilla.redhat.com/show_bug.cgi?id=1694010) or something similar. In my case, the workaround did not work, and I was left with a cluster that had gone into no-quorum mode and stopped all the bricks. There wasn't much in the logs either, but I noticed my /etc/glusterfs/glusterd.vol files were not the same as the newer versions, so I updated them, restarted glusterd, and suddenly the updated node showed as peer-in-cluster again. Once I updated the other nodes the same way, things started working again. Maybe a place to look?
My old config (all nodes):

volume management
    type mgmt/glusterd
    option working-directory /var/lib/glusterd
    option transport-type socket
    option transport.socket.keepalive-time 10
    option transport.socket.keepalive-interval 2
    option transport.socket.read-fail-log off
    option ping-timeout 10
    option event-threads 1
    option rpc-auth-allow-insecure on
#   option transport.address-family inet6
#   option base-port 49152
end-volume
changed to:

volume management
    type mgmt/glusterd
    option working-directory /var/lib/glusterd
    option transport-type socket,rdma
    option transport.socket.keepalive-time 10
    option transport.socket.keepalive-interval 2
    option transport.socket.read-fail-log off
    option transport.socket.listen-port 24007
    option transport.rdma.listen-port 24008
    option ping-timeout 0
    option event-threads 1
    option rpc-auth-allow-insecure on
#   option lock-timer 180
#   option transport.address-family inet6
#   option base-port 49152
    option max-port 60999
end-volume
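For reference, applying that change came down to roughly the following on each node (a rough sketch, assuming glusterd is managed by systemd):

    # edit the management volfile to match the newer defaults
    vi /etc/glusterfs/glusterd.vol
    # restart the management daemon so it reloads the volfile
    systemctl restart glusterd
    # check that the node shows up as "Peer in Cluster (Connected)" again
    gluster peer status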
The only thing I found in the glusterd logs that looks relevant was the following (repeated for both of the other nodes in this cluster), so I have no clue why it happened:

[2019-04-03 20:19:16.802638] I [MSGID: 106004] [glusterd-handler.c:6427:__glusterd_peer_rpc_notify] 0-management: Peer <ossuary-san> (<0ecbf953-681b-448f-9746-d1c1fe7a0978>), in state <Peer in Cluster>, has disconnected from glusterd.
Comments inline.
On Mon, Apr 1, 2019 at 5:55 AM Sankarshan Mukhopadhyay <sankarshan.mukhopadhyay@xxxxxxxxx> wrote:
>
> Quite a considerable amount of detail here. Thank you!
>
> On Fri, Mar 29, 2019 at 11:42 AM Hari Gowtham <hgowtham@xxxxxxxxxx> wrote:
> >
> > Hello Gluster users,
> >
> > As you are all aware, glusterfs-6 is out. We would like to inform you
> > that we have spent a significant amount of time testing
> > glusterfs-6 in upgrade scenarios. We have done upgrade testing to
> > glusterfs-6 from various releases like 3.12, 4.1 and 5.3.
> >
> > As glusterfs-6 has got in a lot of changes, we wanted to test those portions.
> > There were xlators (and respective options to enable/disable them)
> > added and deprecated in glusterfs-6 from various versions [1].
> >
> > We had to check the following upgrade scenarios for all such options
> > identified in [1]:
> > 1) option never enabled and upgraded
> > 2) option enabled and then upgraded
> > 3) option enabled, then disabled, and then upgraded
> >
> > We weren't able to manually check all the combinations for all the options,
> > so the options involving enabling and disabling xlators were prioritized.
> > Below are the results of the ones tested.
> >
> > Never enabled and upgraded:
> > Checked upgrades from 3.12, 4.1 and 5.3 to 6; the upgrade works.
> >
> > Enabled and upgraded:
> > Tested for tier, which is deprecated. It is not a recommended upgrade.
> > As expected the volume won't be consumable and will have a few more
> > issues as well.
> > Tested with 3.12, 4.1 and 5.3 to 6 upgrades.
> >
> > Enabled, then disabled before upgrade:
> > Tested for tier with 3.12 and the upgrade went fine.
> >
> > There is one common issue to note in every upgrade. The node being
> > upgraded goes into a disconnected state. You have to flush the iptables
> > rules and then restart glusterd on all nodes to fix this.
> >
>
> Is this something that is written in the upgrade notes? I do not seem
> to recall; if not, I'll send a PR
No, this wasn't mentioned in the release notes. PRs are welcome.
> >
> > The testing for enabling new options is still pending. The new options
> > won't cause as many issues as the deprecated ones, so this was put at
> > the end of the priority list. It would be nice to get contributions
> > for this.
> >
>
> Did the range of tests lead to any new issues?
Yes. In the first round of testing we found an issue and had to postpone the release of 6 until the fix was made available. https://bugzilla.redhat.com/show_bug.cgi?id=1684029
We then tested it again after that patch was made available and came across this: https://bugzilla.redhat.com/show_bug.cgi?id=1694010
This turned out not to be a bug, as we found that the upgrade worked seamlessly in two different setups. So we have no issues in the upgrade path to the glusterfs-6 release.
I have mentioned in the second mail how to work around this situation for now, until the fix is available.
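For anyone hitting the disconnected state mid-upgrade, a rough sketch of that workaround (assuming iptables is in use and glusterd runs under systemd), run on all nodes:

    # flush firewall rules that may be blocking traffic between peers
    iptables -F
    # restart the management daemon so the peers reconnect
    systemctl restart glusterd
    # confirm all peers show "Peer in Cluster (Connected)"
    gluster peer status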
> >
> > For the disable testing, tier was used as it covers most of the xlators
> > that were removed. And all of these tests were done on a replica 3 volume.
> >
>
> I'm not sure if the Glusto team is reading this, but it would be
> pertinent to understand if the approach you have taken can be
> converted into a form of automated testing pre-release.
I don't have an answer for this; I have CCed Vijay. He might have an idea.
> >
> > Note: This is only for upgrade testing of the newly added and removed
> > xlators. It does not involve the normal tests for the xlators.
> >
> > If you have any questions, please feel free to reach us.
> >
> > [1] https://docs.google.com/spreadsheets/d/1nh7T5AXaV6kc5KgILOy2pEqjzC3t_R47f1XUXSVFetI/edit?usp=sharing
> >
> > Regards,
> > Hari and Sanju.
--
Regards,
Hari Gowtham.
_______________________________________________
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
https://lists.gluster.org/mailman/listinfo/gluster-users
--
--Atin