Hi,

I have a ~50 node cluster. I configured gluster with 2 volumes: one on top of HDD, and the other on top of RAM.

[root@nmIDPP20 ~]# gluster volume info

Volume Name: ram
Type: Distributed-Replicate
Volume ID: a97fa262-276b-41e9-8f59-40f28451f689
Status: Started
Number of Bricks: 5 x 2 = 10
Transport-type: tcp
Bricks:
Brick1: 10.238.0.15:/mnt/ram/data
Brick2: 10.238.0.16:/mnt/ram/data
Brick3: 10.238.0.17:/mnt/ram/data
Brick4: 10.238.0.20:/mnt/ram/data
Brick5: 10.238.0.19:/mnt/ram/data
Brick6: 10.238.0.28:/mnt/ram/data
Brick7: 10.238.0.27:/mnt/ram/data
Brick8: 10.238.0.21:/mnt/ram/data
Brick9: 10.238.0.24:/mnt/ram/data
Brick10: 10.238.0.26:/mnt/ram/data

Volume Name: disk
Type: Replicate
Volume ID: 9607ae5f-0dbf-4164-b260-5d9ce26d4fc7
Status: Started
Number of Bricks: 1 x 3 = 3
Transport-type: tcp
Bricks:
Brick1: 10.238.0.18:/var/cache/gluster/data/options/pp/data
Brick2: 10.238.0.16:/var/cache/gluster/data/options/pp/data
Brick3: 10.238.0.17:/var/cache/gluster/data/options/pp/data

I have reinstalled one of the servers from bare metal, 10.238.0.22, and am now trying to add it to the pool. After running the gluster peer probe 10.238.0.22 command, we can see that it is in the pool:

[root@nmIDPP20 ~]# gluster pool list
UUID Hostname State
baa648a5-ff35-44e0-80ea-a55e43154d12 10.238.0.50 Connected
20bb470a-85da-4e3a-a66b-08a935c189ae 10.238.0.26 Connected
79dffcf8-8c3a-47b5-926a-39be2c1406da 10.238.0.13 Disconnected
7212e375-76a4-46c9-8bac-7470e2e5a910 10.238.0.17 Connected
c6080a14-33d7-4012-8940-2d9232752551 10.238.0.14 Connected
b553ed3c-21f1-4110-808d-4b08e6ded200 10.238.0.28 Connected
5e596931-9151-4f5b-bc57-feb6fe46054f 10.238.0.7 Connected
8e1128ed-df07-4747-812e-dcc280fce5c1 10.238.0.16 Connected
0b5fae30-e169-42ee-8f39-678d6fc93ac2 10.238.0.19 Connected
0f82df55-3994-4561-8a0a-1c1d2e9c3cff 10.238.0.29 Connected
446ea1e4-61b9-4881-9073-6aeb9a154710 10.238.0.24 Connected
bcf84149-415b-4eb7-8dc1-2b284e135307 10.238.0.27 Connected
97dddf9f-0b57-4bb8-86fd-196cb51df4b6 10.238.0.20 Connected
b2bf8b3c-890b-423b-b901-f16f1186c3e6 10.238.0.4 Connected
878ba732-0fea-4734-b1bc-a08ad7a2c97a 10.238.0.9 Connected
51750fb0-c182-4e76-821f-16cee23fdf27 10.238.0.6 Connected
b162e108-4301-47df-875f-92151244b694 10.238.0.8 Connected
25d29db8-0916-4ef4-80d1-34fbf8aa5d26 10.238.0.21 Connected
9acfb879-7df9-4c87-aa1c-eb518b9c668d 10.238.0.12 Connected
aacd1fa1-940c-4cec-9b04-1fb49348e764 10.238.0.49 Connected
5c36b282-9842-4b85-8d0f-e5101817dfe1 10.238.0.18 Connected
a5298a13-144d-46e1-856f-91ade6649840 10.238.0.10 Connected
4e7b83bd-367e-419d-aa5b-34947021dbc3 10.238.0.48 Connected
6aa7957f-be6f-4bee-a748-32937d3ababd 10.238.0.47 Connected
3890ac7d-7959-4565-86de-fc792cc357b0 10.238.0.45 Disconnected
4814a743-5b52-44ab-b169-e907082aa229 10.238.0.32 Connected
cf735cd8-75e3-413b-88c5-46e5b79f7558 10.238.0.42 Connected
b1fa7e22-2e1b-4d07-966e-3096e58e5c78 10.238.0.39 Connected
1459fce8-110c-478f-815e-89507225226e 10.238.0.34 Connected
a7b21ee9-970b-4d99-9f8f-b7e1cbf4be77 10.238.0.25 Connected
dab1a271-4244-41bc-b770-7b13bd6e399d 10.238.0.43 Connected
5b483c65-0d04-4188-85a9-77dfbbef78cd 10.238.0.41 Connected
1b8cb9d8-ce8f-49aa-b958-705dd09db073 10.238.0.40 Connected
4b4f85a0-1310-45df-a613-e33c967cc53d 10.238.0.38 Connected
dab043b8-11ba-4fa6-9b82-baa18b41167d 10.238.0.33 Disconnected
06cbc4c2-9d79-4689-9ac6-3dbc2250d903 10.238.0.30 Connected
f33451c7-e984-495c-8e34-0b2d99a21e1e 10.238.0.31 Connected
1873e2ce-1239-4b6d-930f-af14e9c1f13b 10.238.0.5 Connected
c85de12f-23e6-4797-adb4-d33b7b4eb5fc 10.238.0.11 Connected
4147639d-652e-49a8-aa8b-d77327cca9ca 10.238.0.15 Connected
07580a32-c558-449d-b454-044fb679c908 10.238.0.22 Connected
d5140e78-498d-4c63-868d-189554aef7d4 localhost Connected

But gluster peer status gives the following output:

[root@nmIDPP20 ~]# gluster peer status
Number of Peers: 41

Hostname: 10.238.0.50
Uuid: baa648a5-ff35-44e0-80ea-a55e43154d12
State: Peer in Cluster (Connected)

Hostname: 10.238.0.26
Uuid: 20bb470a-85da-4e3a-a66b-08a935c189ae
State: Peer in Cluster (Connected)

Hostname: 10.238.0.13
Uuid: 79dffcf8-8c3a-47b5-926a-39be2c1406da
State: Peer in Cluster (Disconnected)

Hostname: 10.238.0.17
Uuid: 7212e375-76a4-46c9-8bac-7470e2e5a910
State: Peer in Cluster (Connected)

Hostname: 10.238.0.14
Uuid: c6080a14-33d7-4012-8940-2d9232752551
State: Peer in Cluster (Connected)

Hostname: 10.238.0.28
Uuid: b553ed3c-21f1-4110-808d-4b08e6ded200
State: Peer in Cluster (Connected)

Hostname: 10.238.0.7
Uuid: 5e596931-9151-4f5b-bc57-feb6fe46054f
State: Peer in Cluster (Connected)

Hostname: 10.238.0.16
Uuid: 8e1128ed-df07-4747-812e-dcc280fce5c1
State: Peer in Cluster (Connected)

Hostname: 10.238.0.19
Uuid: 0b5fae30-e169-42ee-8f39-678d6fc93ac2
State: Peer in Cluster (Connected)

Hostname: 10.238.0.29
Uuid: 0f82df55-3994-4561-8a0a-1c1d2e9c3cff
State: Peer in Cluster (Connected)

Hostname: 10.238.0.24
Uuid: 446ea1e4-61b9-4881-9073-6aeb9a154710
State: Peer in Cluster (Connected)

Hostname: 10.238.0.27
Uuid: bcf84149-415b-4eb7-8dc1-2b284e135307
State: Peer in Cluster (Connected)

Hostname: 10.238.0.20
Uuid: 97dddf9f-0b57-4bb8-86fd-196cb51df4b6
State: Peer in Cluster (Connected)

Hostname: 10.238.0.4
Uuid: b2bf8b3c-890b-423b-b901-f16f1186c3e6
State: Peer in Cluster (Connected)

Hostname: 10.238.0.9
Uuid: 878ba732-0fea-4734-b1bc-a08ad7a2c97a
State: Peer in Cluster (Connected)

Hostname: 10.238.0.6
Uuid: 51750fb0-c182-4e76-821f-16cee23fdf27
State: Peer in Cluster (Connected)

Hostname: 10.238.0.8
Uuid: b162e108-4301-47df-875f-92151244b694
State: Peer in Cluster (Connected)

Hostname: 10.238.0.21
Uuid: 25d29db8-0916-4ef4-80d1-34fbf8aa5d26
State: Peer in Cluster (Connected)

Hostname: 10.238.0.12
Uuid: 9acfb879-7df9-4c87-aa1c-eb518b9c668d
State: Peer in Cluster (Connected)

Hostname: 10.238.0.49
Uuid: aacd1fa1-940c-4cec-9b04-1fb49348e764
State: Peer in Cluster (Connected)

Hostname: 10.238.0.18
Uuid: 5c36b282-9842-4b85-8d0f-e5101817dfe1
State: Peer in Cluster (Connected)

Hostname: 10.238.0.10
Uuid: a5298a13-144d-46e1-856f-91ade6649840
State: Peer in Cluster (Connected)

Hostname: 10.238.0.48
Uuid: 4e7b83bd-367e-419d-aa5b-34947021dbc3
State: Peer in Cluster (Connected)

Hostname: 10.238.0.47
Uuid: 6aa7957f-be6f-4bee-a748-32937d3ababd
State: Peer in Cluster (Connected)

Hostname: 10.238.0.45
Uuid: 3890ac7d-7959-4565-86de-fc792cc357b0
State: Peer in Cluster (Disconnected)

Hostname: 10.238.0.32
Uuid: 4814a743-5b52-44ab-b169-e907082aa229
State: Peer in Cluster (Connected)

Hostname: 10.238.0.42
Uuid: cf735cd8-75e3-413b-88c5-46e5b79f7558
State: Peer in Cluster (Connected)

Hostname: 10.238.0.39
Uuid: b1fa7e22-2e1b-4d07-966e-3096e58e5c78
State: Peer in Cluster (Connected)

Hostname: 10.238.0.34
Uuid: 1459fce8-110c-478f-815e-89507225226e
State: Peer in Cluster (Connected)

Hostname: 10.238.0.25
Uuid: a7b21ee9-970b-4d99-9f8f-b7e1cbf4be77
State: Peer in Cluster (Connected)

Hostname: 10.238.0.43
Uuid: dab1a271-4244-41bc-b770-7b13bd6e399d
State: Peer in Cluster (Connected)

Hostname: 10.238.0.41
Uuid: 5b483c65-0d04-4188-85a9-77dfbbef78cd
State: Peer in Cluster (Connected)

Hostname: 10.238.0.40
Uuid: 1b8cb9d8-ce8f-49aa-b958-705dd09db073
State: Peer in Cluster (Connected)

Hostname: 10.238.0.38
Uuid: 4b4f85a0-1310-45df-a613-e33c967cc53d
State: Peer in Cluster (Connected)

Hostname: 10.238.0.33
Uuid: dab043b8-11ba-4fa6-9b82-baa18b41167d
State: Peer in Cluster (Disconnected)

Hostname: 10.238.0.30
Uuid: 06cbc4c2-9d79-4689-9ac6-3dbc2250d903
State: Peer in Cluster (Connected)

Hostname: 10.238.0.31
Uuid: f33451c7-e984-495c-8e34-0b2d99a21e1e
State: Peer in Cluster (Connected)

Hostname: 10.238.0.5
Uuid: 1873e2ce-1239-4b6d-930f-af14e9c1f13b
State: Peer in Cluster (Connected)

Hostname: 10.238.0.11
Uuid: c85de12f-23e6-4797-adb4-d33b7b4eb5fc
State: Peer in Cluster (Connected)

Hostname: 10.238.0.15
Uuid: 4147639d-652e-49a8-aa8b-d77327cca9ca
State: Peer in Cluster (Connected)

Hostname: 10.238.0.22
Uuid: 07580a32-c558-449d-b454-044fb679c908
State: Probe Sent to Peer (Connected)

After staying in this state for about 10 minutes, the .22 node disappears from the pool list. Also, during the peer probe, on node .22, if you run
gluster pool list, it hangs and does nothing. Only after a few minutes does it release the shell, and it outputs nothing.

I've tried a couple of things to resolve the issue:

1. Disabled the firewall -> didn't help
2. Removed the mgmt directory from .22, restarted the gluster service and the glusterfs/d processes -> didn't help
3. Tried to probe .22 from another server -> didn't help
4. Reset the uuid of .22 -> didn't help

I don't know what more I can do, so I am asking for your support. Following are the logs during the probe, from .22 and .23:

10.238.0.22:
[2016-05-06 19:45:24.463346] I [glusterd-handler.c:1114:__glusterd_handle_cli_list_friends] 0-glusterd: Received cli list req
[2016-05-06 19:46:01.295054] I [glusterd-handler.c:1114:__glusterd_handle_cli_list_friends] 0-glusterd: Received cli list req
[2016-05-06 19:46:50.518018] I [glusterd-handshake.c:563:__glusterd_mgmt_hndsk_versions_ack] 0-management: using the op-version 30501
[2016-05-06 19:46:50.521829] I [glusterd-handler.c:2346:__glusterd_handle_probe_query] 0-glusterd: Received probe from uuid: d5140e78-498d-4c63-868d-189554aef7d4
[2016-05-06 19:47:10.542419] I [glusterd-handler.c:2374:__glusterd_handle_probe_query] 0-glusterd: Unable to find peerinfo for host: 10.238.0.23 (24007)
[2016-05-06 19:47:10.548116] I [rpc-clnt.c:972:rpc_clnt_connection_init] 0-management: setting frame-timeout to 600
[2016-05-06 19:47:10.548218] I [socket.c:3561:socket_init] 0-management: SSL support is NOT enabled
[2016-05-06 19:47:10.548239] I [socket.c:3576:socket_init] 0-management: using system polling thread
[2016-05-06 19:47:10.553769] I [glusterd-handler.c:2912:glusterd_friend_add] 0-management: connect returned 0
[2016-05-06 19:47:10.553886] I [glusterd-handler.c:2398:__glusterd_handle_probe_query] 0-glusterd: Responded to 10.238.0.23, op_ret: 0, op_errno: 0, ret: 0
[2016-05-06 19:47:10.554650] I [glusterd-handler.c:2050:__glusterd_handle_incoming_friend_req] 0-glusterd: Received probe from uuid: d5140e78-498d-4c63-868d-189554aef7d4
[2016-05-06 19:50:50.812036] E [glusterd-utils.c:4692:glusterd_brick_start] 0-management: Could not find peer on which brick 10.238.0.15:/mnt/ram/data resides

10.238.0.23:
[2016-05-06 19:46:28.982091] I [glusterd-handler.c:1114:__glusterd_handle_cli_list_friends] 0-glusterd: Received cli list req
[2016-05-06 19:46:31.930017] E [glusterd-handshake.c:942:__glusterd_mgmt_hndsk_version_ack_cbk] 0-management: Failed to get handshake ack from remote server
[2016-05-06 19:46:34.934960] E [glusterd-handshake.c:942:__glusterd_mgmt_hndsk_version_ack_cbk] 0-management: Failed to get handshake ack from remote server
[2016-05-06 19:46:37.916015] E [glusterd-handshake.c:942:__glusterd_mgmt_hndsk_version_ack_cbk] 0-management: Failed to get handshake ack from remote server
[2016-05-06 19:46:40.947036] E [glusterd-handshake.c:942:__glusterd_mgmt_hndsk_version_ack_cbk] 0-management: Failed to get handshake ack from remote server
[2016-05-06 19:46:43.950373] E [glusterd-handshake.c:942:__glusterd_mgmt_hndsk_version_ack_cbk] 0-management: Failed to get handshake ack from remote server
[2016-05-06 19:46:46.961104] E [glusterd-handshake.c:942:__glusterd_mgmt_hndsk_version_ack_cbk] 0-management: Failed to get handshake ack from remote server
[2016-05-06 19:46:49.966875] E [glusterd-handshake.c:942:__glusterd_mgmt_hndsk_version_ack_cbk] 0-management: Failed to get handshake ack from remote server
[2016-05-06 19:46:50.497510] I [glusterd-handler.c:918:__glusterd_handle_cli_probe] 0-glusterd: Received CLI probe req 10.238.0.22 24007
[2016-05-06 19:46:50.502555] I [glusterd-handler.c:2931:glusterd_probe_begin] 0-glusterd: Unable to find peerinfo for host: 10.238.0.22 (24007)
[2016-05-06 19:46:50.511183] I [rpc-clnt.c:972:rpc_clnt_connection_init] 0-management: setting frame-timeout to 600
[2016-05-06 19:46:50.511279] I [socket.c:3561:socket_init] 0-management: SSL support is NOT enabled
[2016-05-06 19:46:50.511300] I [socket.c:3576:socket_init] 0-management: using system polling thread
[2016-05-06 19:46:50.517005] I [glusterd-handler.c:2912:glusterd_friend_add] 0-management: connect returned 0
[2016-05-06 19:46:52.983838] E [glusterd-handshake.c:942:__glusterd_mgmt_hndsk_version_ack_cbk] 0-management: Failed to get handshake ack from remote server
[2016-05-06 19:46:55.975533] E [glusterd-handshake.c:942:__glusterd_mgmt_hndsk_version_ack_cbk] 0-management: Failed to get handshake ack from remote server
[2016-05-06 19:46:58.989536] E [glusterd-handshake.c:942:__glusterd_mgmt_hndsk_version_ack_cbk] 0-management: Failed to get handshake ack from remote server
[2016-05-06 19:47:01.994423] E [glusterd-handshake.c:942:__glusterd_mgmt_hndsk_version_ack_cbk] 0-management: Failed to get handshake ack from remote server
[2016-05-06 19:47:04.995025] E [glusterd-handshake.c:942:__glusterd_mgmt_hndsk_version_ack_cbk] 0-management: Failed to get handshake ack from remote server
[2016-05-06 19:47:07.995849] E [glusterd-handshake.c:942:__glusterd_mgmt_hndsk_version_ack_cbk] 0-management: Failed to get handshake ack from remote server
[2016-05-06 19:47:10.553738] I [glusterd-rpc-ops.c:234:__glusterd_probe_cbk] 0-glusterd: Received probe resp from uuid: 07580a32-c558-449d-b454-044fb679c908, host: 10.238.0.22
[2016-05-06 19:47:10.559641] I [glusterd-rpc-ops.c:306:__glusterd_probe_cbk] 0-glusterd: Received resp to probe req
[2016-05-06 19:47:11.006166] E [glusterd-handshake.c:942:__glusterd_mgmt_hndsk_version_ack_cbk] 0-management: Failed to get handshake ack from remote server
[2016-05-06 19:47:14.009705] E [glusterd-handshake.c:942:__glusterd_mgmt_hndsk_version_ack_cbk] 0-management: Failed to get handshake ack from remote server
[2016-05-06 19:47:16.996479] E [glusterd-handshake.c:942:__glusterd_mgmt_hndsk_version_ack_cbk] 0-management: Failed to get handshake ack from remote server
[2016-05-06 19:47:20.024705] E [glusterd-handshake.c:942:__glusterd_mgmt_hndsk_version_ack_cbk] 0-management: Failed to get handshake ack from remote server
[2016-05-06 19:47:23.035546] E [glusterd-handshake.c:942:__glusterd_mgmt_hndsk_version_ack_cbk] 0-management: Failed to get handshake ack from remote server
[2016-05-06 19:47:26.041132] E [glusterd-handshake.c:942:__glusterd_mgmt_hndsk_version_ack_cbk] 0-management: Failed to get handshake ack from remote server

Please advise on how to handle this issue.

Thanks,
Azamat
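
In case it is useful for comparing the two outputs above: a minimal sketch in Python (not part of gluster) that parses the text of gluster peer status and lists the peers that are either not in the "Peer in Cluster" state or not connected. The function names parse_peer_status and problem_peers are hypothetical, and the sketch assumes the Hostname / Uuid / State layout of the output quoted above; the sample text is a three-peer excerpt from that output.

```python
import re

# Three-peer excerpt of the `gluster peer status` output quoted in this post.
SAMPLE = """\
Number of Peers: 41

Hostname: 10.238.0.50
Uuid: baa648a5-ff35-44e0-80ea-a55e43154d12
State: Peer in Cluster (Connected)

Hostname: 10.238.0.13
Uuid: 79dffcf8-8c3a-47b5-926a-39be2c1406da
State: Peer in Cluster (Disconnected)

Hostname: 10.238.0.22
Uuid: 07580a32-c558-449d-b454-044fb679c908
State: Probe Sent to Peer (Connected)
"""

# Matches "State: <peer state> (<connection state>)".
STATE_RE = re.compile(r"^State:\s*(.*?)\s*\((\w+)\)\s*$")


def parse_peer_status(text):
    """Return one dict per peer with hostname, uuid, state and connection."""
    peers = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("Hostname:"):
            # A Hostname line starts a new peer record.
            current = {"hostname": line.split(":", 1)[1].strip()}
            peers.append(current)
        elif current is None:
            continue  # header lines before the first peer
        elif line.startswith("Uuid:"):
            current["uuid"] = line.split(":", 1)[1].strip()
        else:
            match = STATE_RE.match(line)
            if match:
                current["state"], current["connection"] = match.groups()
    return peers


def problem_peers(peers):
    """Peers that are not fully joined or not currently connected."""
    return [
        p for p in peers
        if p.get("state") != "Peer in Cluster" or p.get("connection") != "Connected"
    ]


if __name__ == "__main__":
    for peer in problem_peers(parse_peer_status(SAMPLE)):
        print(peer["hostname"], peer.get("state"), peer.get("connection"))
```

On the excerpt this reports 10.238.0.13 (disconnected) and 10.238.0.22 (still in "Probe Sent to Peer"), which is the discrepancy described above.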