I have replicated the environment I want to upgrade in a testing lab, with the following configuration:
Distributed gluster volume (one brick per node):
- Node gluster-1: glusterfs version 3.7.4
- Node gluster-2: glusterfs version 3.7.4
- Node gluster-3: glusterfs version 3.7.4
I began by upgrading only the first node to the newest version (3.7.5).
[root@gluster-1 ~]# gluster --version
glusterfs 3.7.5 built on Oct 7 2015 16:27:05
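For reference, a single-node upgrade of this kind is essentially a package update plus a glusterd restart; roughly (a sketch, assuming systemd and yum-based packages, so adjust to your package source):

systemctl stop glusterd
killall glusterfsd glusterfs    # stop any remaining brick/client processes (ignore "no process found")
yum update glusterfs\*          # pull in the 3.7.5 packages
systemctl start glusterd
gluster --version               # confirm the new version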
When I queried the volume status, I got these error messages:
[root@gluster-1 ~]# gluster volume status
Staging failed on gluster-2. Please check log file for details.
Staging failed on gluster-3. Please check log file for details.
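As a sanity check, peer connectivity and the cluster operating version can be compared across the nodes like this (a sketch, assuming the default glusterd paths):

gluster peer status                                        # every peer should show "Peer in Cluster (Connected)"
grep operating-version /var/lib/glusterd/glusterd.info     # should be identical on all three nodes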
On node gluster-2, the tail of /var/log/glusterfs/etc-glusterfs-glusterd.vol.log shows:
[2015-10-26 10:50:16.378672] E [MSGID: 106062] [glusterd-volume-ops.c:1796:glusterd_op_stage_heal_volume] 0-glusterd: Unable to get volume name
[2015-10-26 10:50:16.378735] E [MSGID: 106301] [glusterd-op-sm.c:5171:glusterd_op_ac_stage_op] 0-management: Stage failed on operation 'Volume Heal', Status : -2
On the other hand, if I upgrade all the nodes at the same time, everything seems to work fine!
The issue seems to appear only when nodes run different versions (3.7.4 and 3.7.5).
Is this normal behavior? Is it necessary to stop the entire cluster?
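For comparison, taking the whole cluster down for the upgrade would look roughly like this (a sketch, assuming systemd and yum-based packages; 'myvolume' is a placeholder for the real volume name):

# on one node: stop the volume so no bricks are in use during the upgrade
gluster volume stop myvolume
# on every node: stop the daemons, upgrade, restart
systemctl stop glusterd
killall glusterfsd glusterfs
yum update glusterfs\*
systemctl start glusterd
# on one node: bring the volume back and verify
gluster volume start myvolume
gluster volume status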
Regards,
.....................................................................
Juan Francisco Rodríguez Cardoso
.....................................................................
On 26 October 2015 at 11:48, Alan Orth <alan.orth@xxxxxxxxx> wrote:
Hi,
We're debating updating from 3.5.x to 3.7.x soon on our 2x2 replica set and these upgrade issues are a bit worrying. Can I hear a few voices from people who have had positive experiences? :)
Thanks,
Alan
--
On Fri, Oct 23, 2015 at 6:32 PM, JuanFra Rodríguez Cardoso <jfrodriguez@xxxxxxxxxx> wrote:
I had that problem too, but I was not able to fix it. I was forced to downgrade to 3.7.4 to continue running my gluster volumes.
The upgrade process (3.7.4 -> 3.7.5) does not seem fully reliable.
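In case it helps anyone in the same situation, the per-node downgrade can be done roughly like this (a sketch, assuming yum-based packages and that the 3.7.4 builds are still available in the configured repos):

systemctl stop glusterd
killall glusterfsd glusterfs
yum downgrade glusterfs\*    # falls back to the previous version available in the repos (3.7.4 here)
systemctl start glusterd
gluster --version            # should report 3.7.4 again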
Best,
.....................................................................
Juan Francisco Rodríguez Cardoso
.....................................................................
On 16 October 2015 at 15:24, David Robinson <david.robinson@xxxxxxxxxxxxx> wrote:
That log was the frick one, which is the node that I upgraded. The frack one is attached. One thing I did notice was the errors below in the etc log file. The /usr/lib64/glusterfs/3.7.5 directory doesn't exist yet on frack.
+------------------------------------------------------------------------------+
[2015-10-16 12:04:06.235993] I [MSGID: 101190] [event-epoll.c:632:event_dispatch_epoll_worker] 0-epoll: Started thread with index 2
[2015-10-16 12:04:06.236036] I [MSGID: 101190] [event-epoll.c:632:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1
[2015-10-16 12:04:06.236099] I [MSGID: 101190] [event-epoll.c:632:event_dispatch_epoll_worker] 0-epoll: Started thread with index 2
[2015-10-16 12:04:09.242413] E [socket.c:2278:socket_connect_finish] 0-management: connection to 10.200.82.1:24007 failed (No route to host)
[2015-10-16 12:04:09.242504] I [MSGID: 106004] [glusterd-handler.c:5056:__glusterd_peer_rpc_notify] 0-management: Peer <frackib01.corvidtec.com> (<8ab9a966-d536-4bd1-828a-64b2d72c47ca>), in state <Peer in Cluster>, has disconnected from glusterd.
[2015-10-16 12:04:09.726895] W [socket.c:869:__socket_keepalive] 0-socket: failed to set TCP_USER_TIMEOUT -1000 on socket 14, Invalid argument
[2015-10-16 12:04:09.726918] E [socket.c:2965:socket_connect] 0-management: Failed to set keep-alive: Invalid argument
[2015-10-16 12:04:09.902756] W [MSGID: 101095] [xlator.c:143:xlator_volopt_dynload] 0-xlator: /usr/lib64/glusterfs/3.7.5/xlator/rpc-transport/socket.so: cannot open shared object file: No such file or directory
------ Original Message ------
From: "Mohammed Rafi K C" <rkavunga@xxxxxxxxxx>
To: "David Robinson" <drobinson@xxxxxxxxxxxxx>; "gluster-users@xxxxxxxxxxx" <gluster-users@xxxxxxxxxxx>; "Gluster Devel" <gluster-devel@xxxxxxxxxxx>
Sent: 10/16/2015 8:43:21 AM
Subject: Re: 3.7.5 upgrade issues
Hi David,
Are the logs you attached from node "frackib01.corvidtec.com"? If not, can you attach the logs from that node?
Regards
Rafi KC
On 10/16/2015 05:46 PM, David Robinson wrote:
I have a replica pair setup that I was trying to upgrade from 3.7.4 to 3.7.5. After upgrading the rpm packages (rpm -Uvh *.rpm) and rebooting one of the nodes, I am now receiving the following:
[root@frick01 log]# gluster volume status
Staging failed on frackib01.corvidtec.com. Please check log file for details.
The logs are attached and my setup is shown below. Can anyone help?
[root@frick01 log]# gluster volume info
Volume Name: gfs
Type: Replicate
Volume ID: abc63b5c-bed7-4e3d-9057-00930a2d85d3
Status: Started
Number of Bricks: 1 x 2 = 2
Transport-type: tcp,rdma
Bricks:
Brick1: frickib01.corvidtec.com:/data/brick01/gfs
Brick2: frackib01.corvidtec.com:/data/brick01/gfs
Options Reconfigured:
storage.owner-gid: 100
server.allow-insecure: on
performance.readdir-ahead: on
server.event-threads: 4
client.event-threads: 4
David
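Regarding the missing /usr/lib64/glusterfs/3.7.5 directory mentioned above, a quick cross-check on both nodes might look like this (a sketch; hostnames as in the volume info above):

# run on frickib01 and on frackib01, then compare:
rpm -qa 'glusterfs*'                                   # installed glusterfs package versions
ls /usr/lib64/glusterfs/                               # one directory per installed glusterfs version
ls /usr/lib64/glusterfs/3.7.5/xlator/rpc-transport/    # socket.so (and rdma.so, if the rdma package is installed) should be here after the upgrade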
_______________________________________________
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
http://www.gluster.org/mailman/listinfo/gluster-users

Alan Orth
alan.orth@xxxxxxxxx
https://alaninkenya.org
https://mjanja.ch
"In heaven all the interesting people are missing." -Friedrich NietzscheGPG public key ID: 0x8cb0d0acb5cd81ec209c6cdfbd1a0e09c2f836c0
_______________________________________________
Gluster-devel mailing list
Gluster-devel@xxxxxxxxxxx
http://www.gluster.org/mailman/listinfo/gluster-devel