I was attempting the same on a local sandbox and ran into the same problem.

Current: 3.8.4

Volume Name: shchst01
Type: Distributed-Replicate
Volume ID: bcd53e52-cde6-4e58-85f9-71d230b7b0d3
Status: Started
Snapshot Count: 0
Number of Bricks: 4 x 3 = 12
Transport-type: tcp
Bricks:
Brick1: shchhv01-sto:/data/brick3/shchst01
Brick2: shchhv02-sto:/data/brick3/shchst01
Brick3: shchhv03-sto:/data/brick3/shchst01
Brick4: shchhv01-sto:/data/brick1/shchst01
Brick5: shchhv02-sto:/data/brick1/shchst01
Brick6: shchhv03-sto:/data/brick1/shchst01
Brick7: shchhv02-sto:/data/brick2/shchst01
Brick8: shchhv03-sto:/data/brick2/shchst01
Brick9: shchhv04-sto:/data/brick2/shchst01
Brick10: shchhv02-sto:/data/brick4/shchst01
Brick11: shchhv03-sto:/data/brick4/shchst01
Brick12: shchhv04-sto:/data/brick4/shchst01
Options Reconfigured:
cluster.data-self-heal-algorithm: full
features.shard-block-size: 512MB
features.shard: enable
performance.readdir-ahead: on
storage.owner-uid: 9869
storage.owner-gid: 9869
server.allow-insecure: on
performance.quick-read: off
performance.read-ahead: off
performance.io-cache: off
performance.stat-prefetch: off
cluster.eager-lock: enable
network.remote-dio: enable
cluster.quorum-type: auto
cluster.server-quorum-type: server
cluster.self-heal-daemon: on
nfs.disable: on
performance.io-thread-count: 64
performance.cache-size: 1GB

Upgraded shchhv01-sto to 3.12.3; the others remain at 3.8.4.

RESULT
=====================
Hostname: shchhv01-sto
Uuid: f6205edb-a0ea-4247-9594-c4cdc0d05816
State: Peer Rejected (Connected)

Upgraded Server: shchhv01-sto
==============================
[2017-12-20 05:02:44.747313] I [MSGID: 101190] [event-epoll.c:613:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1
[2017-12-20 05:02:44.747387] I [MSGID: 101190] [event-epoll.c:613:event_dispatch_epoll_worker] 0-epoll: Started thread with index 2
[2017-12-20 05:02:44.749087] W [rpc-clnt-ping.c:246:rpc_clnt_ping_cbk] 0-management: RPC_CLNT_PING notify failed
[2017-12-20 05:02:44.749165] W [rpc-clnt-ping.c:246:rpc_clnt_ping_cbk] 0-management: RPC_CLNT_PING notify failed
[2017-12-20 05:02:44.749563] W [rpc-clnt-ping.c:246:rpc_clnt_ping_cbk] 0-management: RPC_CLNT_PING notify failed
[2017-12-20 05:02:54.676324] I [MSGID: 106493] [glusterd-rpc-ops.c:486:__glusterd_friend_add_cbk] 0-glusterd: Received RJT from uuid: 546503ae-ba0e-40d4-843f-c5dbac22d272, host: shchhv02-sto, port: 0
[2017-12-20 05:02:54.690237] I [MSGID: 106163] [glusterd-handshake.c:1316:__glusterd_mgmt_hndsk_versions_ack] 0-management: using the op-version 30800
[2017-12-20 05:02:54.695823] I [MSGID: 106490] [glusterd-handler.c:2540:__glusterd_handle_incoming_friend_req] 0-glusterd: Received probe from uuid: 546503ae-ba0e-40d4-843f-c5dbac22d272
[2017-12-20 05:02:54.696956] E [MSGID: 106010] [glusterd-utils.c:3370:glusterd_compare_friend_volume] 0-management: Version of Cksums shchst01-sto differ. local cksum = 4218452135, remote cksum = 2747317484 on peer shchhv02-sto
[2017-12-20 05:02:54.697796] I [MSGID: 106493] [glusterd-handler.c:3800:glusterd_xfer_friend_add_resp] 0-glusterd: Responded to shchhv02-sto (0), ret: 0, op_ret: -1
[2017-12-20 05:02:55.033822] I [MSGID: 106493] [glusterd-rpc-ops.c:486:__glusterd_friend_add_cbk] 0-glusterd: Received RJT from uuid: 3de22cb5-c1c1-4041-a1e1-eb969afa9b4b, host: shchhv03-sto, port: 0
[2017-12-20 05:02:55.038460] I [MSGID: 106163] [glusterd-handshake.c:1316:__glusterd_mgmt_hndsk_versions_ack] 0-management: using the op-version 30800
[2017-12-20 05:02:55.040032] I [MSGID: 106490] [glusterd-handler.c:2540:__glusterd_handle_incoming_friend_req] 0-glusterd: Received probe from uuid: 3de22cb5-c1c1-4041-a1e1-eb969afa9b4b
[2017-12-20 05:02:55.040266] E [MSGID: 106010] [glusterd-utils.c:3370:glusterd_compare_friend_volume] 0-management: Version of Cksums shchst01-sto differ. local cksum = 4218452135, remote cksum = 2747317484 on peer shchhv03-sto
[2017-12-20 05:02:55.040405] I [MSGID: 106493] [glusterd-handler.c:3800:glusterd_xfer_friend_add_resp] 0-glusterd: Responded to shchhv03-sto (0), ret: 0, op_ret: -1
[2017-12-20 05:02:55.584854] I [MSGID: 106493] [glusterd-rpc-ops.c:486:__glusterd_friend_add_cbk] 0-glusterd: Received RJT from uuid: 36306e37-d7f0-4fec-9140-0d0f1bd2d2d5, host: shchhv04-sto, port: 0
[2017-12-20 05:02:55.595125] I [MSGID: 106163] [glusterd-handshake.c:1316:__glusterd_mgmt_hndsk_versions_ack] 0-management: using the op-version 30800
[2017-12-20 05:02:55.600804] I [MSGID: 106490] [glusterd-handler.c:2540:__glusterd_handle_incoming_friend_req] 0-glusterd: Received probe from uuid: 36306e37-d7f0-4fec-9140-0d0f1bd2d2d5
[2017-12-20 05:02:55.601288] E [MSGID: 106010] [glusterd-utils.c:3370:glusterd_compare_friend_volume] 0-management: Version of Cksums shchst01-sto differ. local cksum = 4218452135, remote cksum = 2747317484 on peer shchhv04-sto
[2017-12-20 05:02:55.601497] I [MSGID: 106493] [glusterd-handler.c:3800:glusterd_xfer_friend_add_resp] 0-glusterd: Responded to shchhv04-sto (0), ret: 0, op_ret: -1
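Side note: the two checksums in those errors can be read straight off disk, since glusterd keeps the value it exchanges during the friend handshake in the volume's cksum file. A quick comparison (a sketch; it assumes the default /var/lib/glusterd working directory and that the file holds a single info=<value> line — the values shown are just the two checksums from the log above):

[root@shchhv01 ~]# cat /var/lib/glusterd/vols/shchst01/cksum
info=4218452135
[root@shchhv02 ~]# cat /var/lib/glusterd/vols/shchst01/cksum
info=2747317484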
Another Server: shchhv02-sto
==============================
[2017-12-20 05:02:44.667833] W [glusterd-locks.c:675:glusterd_mgmt_v3_unlock] (-->/usr/lib64/glusterfs/3.8.4/xlator/mgmt/glusterd.so(+0x1de5c) [0x7f75fdc12e5c] -->/usr/lib64/glusterfs/3.8.4/xlator/mgmt/glusterd.so(+0x27a08) [0x7f75fdc1ca08] -->/usr/lib64/glusterfs/3.8.4/xlator/mgmt/glusterd.so(+0xd07fa) [0x7f75fdcc57fa] ) 0-management: Lock for vol shchst01-sto not held
[2017-12-20 05:02:44.667795] I [MSGID: 106004] [glusterd-handler.c:5219:__glusterd_peer_rpc_notify] 0-management: Peer <shchhv01-sto> (<f6205edb-a0ea-4247-9594-c4cdc0d05816>), in state <Peer Rejected>, has disconnected from glusterd.
[2017-12-20 05:02:44.667948] W [MSGID: 106118] [glusterd-handler.c:5241:__glusterd_peer_rpc_notify] 0-management: Lock not released for shchst01-sto
[2017-12-20 05:02:44.760103] I [MSGID: 106163] [glusterd-handshake.c:1271:__glusterd_mgmt_hndsk_versions_ack] 0-management: using the op-version 30800
[2017-12-20 05:02:44.765389] I [MSGID: 106490] [glusterd-handler.c:2608:__glusterd_handle_incoming_friend_req] 0-glusterd: Received probe from uuid: f6205edb-a0ea-4247-9594-c4cdc0d05816
[2017-12-20 05:02:54.686185] E [MSGID: 106010] [glusterd-utils.c:2930:glusterd_compare_friend_volume] 0-management: Version of Cksums shchst01 differ. local cksum = 2747317484, remote cksum = 4218452135 on peer shchhv01-sto
[2017-12-20 05:02:54.686882] I [MSGID: 106493] [glusterd-handler.c:3852:glusterd_xfer_friend_add_resp] 0-glusterd: Responded to shchhv01-sto (0), ret: 0, op_ret: -1
[2017-12-20 05:02:54.717854] I [MSGID: 106493] [glusterd-rpc-ops.c:476:__glusterd_friend_add_cbk] 0-glusterd: Received RJT from uuid: f6205edb-a0ea-4247-9594-c4cdc0d05816, host: shchhv01-sto, port: 0

Another Server: shchhv04-sto
==============================
[2017-12-20 05:02:44.667620] I [MSGID: 106004] [glusterd-handler.c:5219:__glusterd_peer_rpc_notify] 0-management: Peer <shchhv01-sto> (<f6205edb-a0ea-4247-9594-c4cdc0d05816>), in state <Peer Rejected>, has disconnected from glusterd.
[2017-12-20 05:02:44.667808] W [glusterd-locks.c:675:glusterd_mgmt_v3_unlock] (-->/usr/lib64/glusterfs/3.8.4/xlator/mgmt/glusterd.so(+0x1de5c) [0x7f10a33d9e5c] -->/usr/lib64/glusterfs/3.8.4/xlator/mgmt/glusterd.so(+0x27a08) [0x7f10a33e3a08] -->/usr/lib64/glusterfs/3.8.4/xlator/mgmt/glusterd.so(+0xd07fa) [0x7f10a348c7fa] ) 0-management: Lock for vol shchst01-sto not held
[2017-12-20 05:02:44.667827] W [MSGID: 106118] [glusterd-handler.c:5241:__glusterd_peer_rpc_notify] 0-management: Lock not released for shchst01-sto
[2017-12-20 05:02:44.760077] I [MSGID: 106163] [glusterd-handshake.c:1271:__glusterd_mgmt_hndsk_versions_ack] 0-management: using the op-version 30800
[2017-12-20 05:02:44.768796] I [MSGID: 106490] [glusterd-handler.c:2608:__glusterd_handle_incoming_friend_req] 0-glusterd: Received probe from uuid: f6205edb-a0ea-4247-9594-c4cdc0d05816
[2017-12-20 05:02:55.595095] E [MSGID: 106010] [glusterd-utils.c:2930:glusterd_compare_friend_volume] 0-management: Version of Cksums shchst01-sto differ. local cksum = 2747317484, remote cksum = 4218452135 on peer shchhv01-sto
[2017-12-20 05:02:55.595273] I [MSGID: 106493] [glusterd-handler.c:3852:glusterd_xfer_friend_add_resp] 0-glusterd: Responded to shchhv01-sto (0), ret: 0, op_ret: -1
[2017-12-20 05:02:55.612957] I [MSGID: 106493] [glusterd-rpc-ops.c:476:__glusterd_friend_add_cbk] 0-glusterd: Received RJT from uuid: f6205edb-a0ea-4247-9594-c4cdc0d05816, host: shchhv01-sto, port: 0
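All of the snippets above are from the glusterd log on each node (glusterd.log, or etc-glusterfs-glusterd.vol.log on some builds). If anyone wants to reproduce the extraction, something like this pulls out just the rejection and checksum lines (a sketch; default log directory assumed):

[root@shchhv01 ~]# grep -E 'Cksums|Peer Rejected|RJT' /var/log/glusterfs/glusterd.log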
<vol>/info

Upgraded Server: shchst01-sto
=========================
type=2
count=12
status=1
sub_count=3
stripe_count=1
replica_count=3
disperse_count=0
redundancy_count=0
version=52
transport-type=0
volume-id=bcd53e52-cde6-4e58-85f9-71d230b7b0d3
username=5a4ae8d8-dbcb-408e-ab73-629255c14ffc
password=58652573-0955-4d00-893a-9f42d0f16717
op-version=30700
client-op-version=30700
quota-version=0
tier-enabled=0
parent_volname=N/A
restored_from_snap=00000000-0000-0000-0000-000000000000
snap-max-hard-limit=256
cluster.data-self-heal-algorithm=full
features.shard-block-size=512MB
features.shard=enable
nfs.disable=on
cluster.self-heal-daemon=on
cluster.server-quorum-type=server
cluster.quorum-type=auto
network.remote-dio=enable
cluster.eager-lock=enable
performance.stat-prefetch=off
performance.io-cache=off
performance.read-ahead=off
performance.quick-read=off
server.allow-insecure=on
storage.owner-gid=9869
storage.owner-uid=9869
performance.readdir-ahead=on
performance.io-thread-count=64
performance.cache-size=1GB
brick-0=shchhv01-sto:-data-brick3-shchst01
brick-1=shchhv02-sto:-data-brick3-shchst01
brick-2=shchhv03-sto:-data-brick3-shchst01
brick-3=shchhv01-sto:-data-brick1-shchst01
brick-4=shchhv02-sto:-data-brick1-shchst01
brick-5=shchhv03-sto:-data-brick1-shchst01
brick-6=shchhv02-sto:-data-brick2-shchst01
brick-7=shchhv03-sto:-data-brick2-shchst01
brick-8=shchhv04-sto:-data-brick2-shchst01
brick-9=shchhv02-sto:-data-brick4-shchst01
brick-10=shchhv03-sto:-data-brick4-shchst01
brick-11=shchhv04-sto:-data-brick4-shchst01

Another Server: shchhv02-sto
==============================
type=2
count=12
status=1
sub_count=3
stripe_count=1
replica_count=3
disperse_count=0
redundancy_count=0
version=52
transport-type=0
volume-id=bcd53e52-cde6-4e58-85f9-71d230b7b0d3
username=5a4ae8d8-dbcb-408e-ab73-629255c14ffc
password=58652573-0955-4d00-893a-9f42d0f16717
op-version=30700
client-op-version=30700
quota-version=0
parent_volname=N/A
restored_from_snap=00000000-0000-0000-0000-000000000000
snap-max-hard-limit=256
cluster.data-self-heal-algorithm=full
features.shard-block-size=512MB
features.shard=enable
performance.readdir-ahead=on
storage.owner-uid=9869
storage.owner-gid=9869
server.allow-insecure=on
performance.quick-read=off
performance.read-ahead=off
performance.io-cache=off
performance.stat-prefetch=off
cluster.eager-lock=enable
network.remote-dio=enable
cluster.quorum-type=auto
cluster.server-quorum-type=server
cluster.self-heal-daemon=on
nfs.disable=on
performance.io-thread-count=64
performance.cache-size=1GB
brick-0=shchhv01-sto:-data-brick3-shchst01
brick-1=shchhv02-sto:-data-brick3-shchst01
brick-2=shchhv03-sto:-data-brick3-shchst01
brick-3=shchhv01-sto:-data-brick1-shchst01
brick-4=shchhv02-sto:-data-brick1-shchst01
brick-5=shchhv03-sto:-data-brick1-shchst01
brick-6=shchhv02-sto:-data-brick2-shchst01
brick-7=shchhv03-sto:-data-brick2-shchst01
brick-8=shchhv04-sto:-data-brick2-shchst01
brick-9=shchhv02-sto:-data-brick4-shchst01
brick-10=shchhv03-sto:-data-brick4-shchst01
brick-11=shchhv04-sto:-data-brick4-shchst01
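Comparing the two info files above: the values all match, but the upgraded node's copy carries one extra line, tier-enabled=0, which the 3.8.4 copy does not have (the option ordering also differs). Since glusterd derives the volume checksum from this file, that extra line alone would plausibly account for local cksum 4218452135 vs remote 2747317484. To confirm on a live pair (a sketch; assumes default paths, bash process substitution, and ssh access between peers):

[root@shchhv01 ~]# diff /var/lib/glusterd/vols/shchst01/info \
                        <(ssh shchhv02-sto cat /var/lib/glusterd/vols/shchst01/info)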
NOTE
[root@shchhv01 shchst01]# gluster volume get shchst01 cluster.op-version
Warning: Support to get global option value using `volume get <volname>` will be deprecated from next release. Consider using `volume get all` instead for global options
Option                Value
------                -----
cluster.op-version    30800

[root@shchhv02 shchst01]# gluster volume get shchst01 cluster.op-version
Option                Value
------                -----
cluster.op-version    30800
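Both nodes still agree on cluster.op-version 30800, which matches the upgrade guide's expectation that the op-version is raised only after all nodes are upgraded. For reference, the remaining sequence would look roughly like this (a sketch, not verified here: it assumes systemd-managed glusterd, yum/dnf packaging, and 31200 as the op-version for 3.12.0 — check the guide Atin links below for the exact value):

# On each remaining 3.8.4 node, one node at a time:
systemctl stop glusterd
pkill glusterfs ; pkill glusterfsd        # stop any remaining client/brick processes
yum -y update glusterfs-server            # dnf on Fedora
systemctl start glusterd
gluster volume heal shchst01 info         # repeat until no unhealed entries remain

# Only after every node runs 3.12, from any one node:
gluster volume set all cluster.op-version 31200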
-----Original Message-----
From: gluster-users-bounces@xxxxxxxxxxx [mailto:gluster-users-bounces@xxxxxxxxxxx] On Behalf Of Ziemowit Pierzycki
Sent: Tuesday, December 19, 2017 3:56 PM
To: gluster-users <gluster-users@xxxxxxxxxxx>
Subject: Re: Upgrading from Gluster 3.8 to 3.12

I have not done the upgrade yet. Since this is a production cluster I
need to make sure it stays up, or schedule some downtime if it doesn't.
Thanks.

On Tue, Dec 19, 2017 at 10:11 AM, Atin Mukherjee <amukherj@xxxxxxxxxx> wrote:
>
> On Tue, Dec 19, 2017 at 1:10 AM, Ziemowit Pierzycki
> <ziemowit@xxxxxxxxxxxxx> wrote:
>>
>> Hi,
>>
>> I have a cluster of 10 servers all running Fedora 24 along with
>> Gluster 3.8. I'm planning on doing rolling upgrades to Fedora 27
>> with Gluster 3.12. I saw the documentation and did some testing, but
>> I would like to run my plan through some (more?) educated minds.
>>
>> The current setup is:
>>
>> Volume Name: vol0
>> Distributed-Replicate
>> Number of Bricks: 2 x (2 + 1) = 6
>> Bricks:
>> Brick1: glt01:/vol/vol0
>> Brick2: glt02:/vol/vol0
>> Brick3: glt05:/vol/vol0 (arbiter)
>> Brick4: glt03:/vol/vol0
>> Brick5: glt04:/vol/vol0
>> Brick6: glt06:/vol/vol0 (arbiter)
>>
>> Volume Name: vol1
>> Distributed-Replicate
>> Number of Bricks: 2 x (2 + 1) = 6
>> Bricks:
>> Brick1: glt07:/vol/vol1
>> Brick2: glt08:/vol/vol1
>> Brick3: glt05:/vol/vol1 (arbiter)
>> Brick4: glt09:/vol/vol1
>> Brick5: glt10:/vol/vol1
>> Brick6: glt06:/vol/vol1 (arbiter)
>>
>> After performing the upgrade, because of differences in checksums,
>> the upgraded nodes will become:
>>
>> State: Peer Rejected (Connected)
>
> Have you upgraded all the nodes? If yes, have you bumped up the
> cluster.op-version after upgrading all the nodes? Please follow
> http://docs.gluster.org/en/latest/Upgrade-Guide/op_version/ for more
> details on how to bump up the cluster.op-version. In case you have
> done all of these and you're seeing a checksum issue then I'm afraid
> you have hit a bug. I'd need further details like the checksum
> mismatch error from the glusterd.log file along with the exact
> volume's info file from /var/lib/glusterd/vols/<volname>/info from
> both the peers to debug this further.
>
>> If I start doing the upgrades one at a time, with nodes glt10 to
>> glt01 except for the arbiters glt05 and glt06, and then upgrading
>> the arbiters last, everything should remain online at all times
>> through the process. Correct?
>>
>> Thanks.
>> _______________________________________________
>> Gluster-users mailing list
>> Gluster-users@xxxxxxxxxxx
>> http://lists.gluster.org/mailman/listinfo/gluster-users
>
> _______________________________________________
> Gluster-users mailing list
> Gluster-users@xxxxxxxxxxx
> http://lists.gluster.org/mailman/listinfo/gluster-users

_______________________________________________
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
http://lists.gluster.org/mailman/listinfo/gluster-users