I have two servers running SLES11 SP1 x86_64, and I compiled the latest version of glusterfs, 3.1.1, on both. The firewall is disabled on both nodes and they are on the same network. I put both hostnames in the hosts file so that each node can resolve the other's hostname correctly:

192.168.8.104 virt-zabbix-02
192.168.8.105 virt-zabbix-03

This is my config on both nodes, "/etc/glusterfs/glusterd.vol":

volume management
    type mgmt/glusterd
    option working-directory /etc/glusterd
    option transport-type socket,rdma
    option transport.socket.keepalive-time 10
    option transport.socket.keepalive-interval 2
end-volume

virt-zabbix-02# gluster peer status
No peers present

log:
[2011-01-13 19:53:31.576554] I [glusterd-handler.c:674:glusterd_handle_cli_list_friends] glusterd: Received cli list req

This is okay, but when I try to add the other node to the cluster, glusterd dies on virt-zabbix-02, where I type the command, and a core-dump file is generated:

virt-zabbix-02# gluster peer probe virt-zabbix-03

log virt-zabbix-02:
[2011-01-13 19:54:29.284735] I [glusterd-handler.c:563:glusterd_handle_cli_probe] glusterd: Received CLI probe req virt-zabbix-03 24007
[2011-01-13 19:54:29.285110] I [glusterd-handler.c:398:glusterd_friend_find] glusterd: Unable to find hostname: virt-zabbix-03
[2011-01-13 19:54:29.285136] I [glusterd-handler.c:2618:glusterd_probe_begin] glusterd: Unable to find peerinfo for host: virt-zabbix-03 (24007)
[2011-01-13 19:54:29.287625] W [rpc-transport.c:849:rpc_transport_load] rpc-transport: missing 'option transport-type'. defaulting to "socket"
[2011-01-13 19:54:29.288496] I [glusterd-handler.c:2600:glusterd_friend_add] glusterd: connect returned 0
[2011-01-13 19:54:29.293369] I [glusterd-utils.c:2101:glusterd_friend_find_by_hostname] glusterd: Friend virt-zabbix-03 found.. state: 0
[2011-01-13 19:54:29.302062] I [glusterd3_1-mops.c:80:glusterd3_1_probe_cbk] glusterd: Received probe resp from uuid: 255540da-4b86-46f2-963c-3214e2c5e28a, host: virt-zabbix-03
[2011-01-13 19:54:29.302097] I [glusterd-handler.c:386:glusterd_friend_find] glusterd: Unable to find peer by uuid
[2011-01-13 19:54:29.302111] I [glusterd-utils.c:2101:glusterd_friend_find_by_hostname] glusterd: Friend virt-zabbix-03 found.. state: 0
pending frames:

patchset: v3.1.1
signal received: 11
time of crash: 2011-01-13 19:54:29
configuration details:
argp 1
backtrace 1
dlfcn 1
fdatasync 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 3.1.1
/lib64/libc.so.6(+0x329e0)[0x7f1cbbb589e0]
/usr/lib64/libgfrpc.so.0(rpc_transport_connect+0xc)[0x7f1cbc4c506c]
/usr/lib64/libgfrpc.so.0(rpc_clnt_submit+0x3d8)[0x7f1cbc4ca878]
/usr/lib64/glusterfs/3.1.1/xlator/mgmt/glusterd.so(glusterd_submit_request+0x15e)[0x7f1cba4203be]
/usr/lib64/glusterfs/3.1.1/xlator/mgmt/glusterd.so(glusterd3_1_friend_add+0x11b)[0x7f1cba424f3b]
/usr/lib64/glusterfs/3.1.1/xlator/mgmt/glusterd.so(+0x27b17)[0x7f1cba40db17]
/usr/lib64/glusterfs/3.1.1/xlator/mgmt/glusterd.so(glusterd_friend_sm+0x175)[0x7f1cba40d675]
/usr/lib64/glusterfs/3.1.1/xlator/mgmt/glusterd.so(glusterd3_1_probe_cbk+0x495)[0x7f1cba4281f5]
/usr/lib64/libgfrpc.so.0(rpc_clnt_handle_reply+0xa4)[0x7f1cbc4c9a94]
/usr/lib64/libgfrpc.so.0(rpc_clnt_notify+0xc8)[0x7f1cbc4c9cd8]
/usr/lib64/libgfrpc.so.0(rpc_transport_notify+0x2e)[0x7f1cbc4c4f2e]
/usr/lib64/glusterfs/3.1.1/rpc-transport/socket.so(socket_event_poll_in+0x3f)[0x7f1cba1def9f]
/usr/lib64/glusterfs/3.1.1/rpc-transport/socket.so(socket_event_handler+0x114)[0x7f1cba1df0d4]
/usr/lib64/libglusterfs.so.0(+0x3a384)[0x7f1cbc70b384]
/usr/sbin/glusterd(main+0x23c)[0x4055dc]
/lib64/libc.so.6(__libc_start_main+0xe6)[0x7f1cbbb44bc6]
/usr/sbin/glusterd[0x4032c9]
---------

log virt-zabbix-03:
[2011-01-13 19:54:29.296723] I [glusterd-handler.c:2387:glusterd_handle_probe_query] glusterd: Received probe from uuid: a9b660c5-456d-4e96-9bdd-d23c917ae941
[2011-01-13 19:54:29.296802] I [glusterd-handler.c:386:glusterd_friend_find] glusterd: Unable to find peer by uuid
[2011-01-13 19:54:29.297224] I [glusterd-handler.c:398:glusterd_friend_find] glusterd: Unable to find hostname: 192.168.8.104
[2011-01-13 19:54:29.297278] I [glusterd-handler.c:2401:glusterd_handle_probe_query] glusterd: Unable to find peerinfo for host: 192.168.8.104 (24007)
[2011-01-13 19:54:29.300119] W [rpc-transport.c:849:rpc_transport_load] rpc-transport: missing 'option transport-type'. defaulting to "socket"
[2011-01-13 19:54:29.304856] I [glusterd-handler.c:2600:glusterd_friend_add] glusterd: connect returned 0
[2011-01-13 19:54:29.304994] I [glusterd-handler.c:2422:glusterd_handle_probe_query] glusterd: Responded to virt-zabbix-03, op_ret: 0, op_errno: 0, ret: 0
[2011-01-13 19:54:35.314773] E [socket.c:1656:socket_connect_finish] management: connection to 192.168.8.104:24007 failed (Connection refused)

So I start glusterd on virt-zabbix-02 again. A few seconds later, glusterd dies on the other node, virt-zabbix-03, and a core-dump file is generated there as well.

log virt-zabbix-02:
[2011-01-13 19:57:08.911495] I [glusterd-handler.c:2387:glusterd_handle_probe_query] glusterd: Received probe from uuid: 255540da-4b86-46f2-963c-3214e2c5e28a
[2011-01-13 19:57:08.911559] I [glusterd-handler.c:386:glusterd_friend_find] glusterd: Unable to find peer by uuid
[2011-01-13 19:57:08.911643] I [glusterd-utils.c:2140:glusterd_friend_find_by_hostname] glusterd: Friend 192.168.8.105 found.. state: 0
[2011-01-13 19:57:08.911715] I [glusterd-handler.c:2422:glusterd_handle_probe_query] glusterd: Responded to 192.168.8.104, op_ret: 0, op_errno: 0, ret: 0
[2011-01-13 19:57:11.956152] E [socket.c:1656:socket_connect_finish] management: connection to 192.168.8.105:24007 failed (Connection refused)

log virt-zabbix-03:
[2011-01-13 19:57:08.913897] I [glusterd-utils.c:2101:glusterd_friend_find_by_hostname] glusterd: Friend 192.168.8.104 found.. state: 0
[2011-01-13 19:57:08.915052] I [glusterd3_1-mops.c:80:glusterd3_1_probe_cbk] glusterd: Received probe resp from uuid: a9b660c5-456d-4e96-9bdd-d23c917ae941, host: 192.168.8.104
[2011-01-13 19:57:08.915085] I [glusterd-handler.c:386:glusterd_friend_find] glusterd: Unable to find peer by uuid
[2011-01-13 19:57:08.915100] I [glusterd-utils.c:2101:glusterd_friend_find_by_hostname] glusterd: Friend 192.168.8.104 found.. state: 0
pending frames:

patchset: v3.1.1
signal received: 11
time of crash: 2011-01-13 19:57:08
configuration details:
argp 1
backtrace 1
dlfcn 1
fdatasync 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 3.1.1
/lib64/libc.so.6(+0x329e0)[0x7fe84e6ee9e0]
/usr/lib64/libgfrpc.so.0(rpc_transport_connect+0xc)[0x7fe84f05b06c]
/usr/lib64/libgfrpc.so.0(rpc_clnt_submit+0x3d8)[0x7fe84f060878]
/usr/lib64/glusterfs/3.1.1/xlator/mgmt/glusterd.so(glusterd_submit_request+0x15e)[0x7fe84cfb63be]
/usr/lib64/glusterfs/3.1.1/xlator/mgmt/glusterd.so(glusterd3_1_friend_add+0x11b)[0x7fe84cfbaf3b]
/usr/lib64/glusterfs/3.1.1/xlator/mgmt/glusterd.so(+0x27b17)[0x7fe84cfa3b17]
/usr/lib64/glusterfs/3.1.1/xlator/mgmt/glusterd.so(glusterd_friend_sm+0x175)[0x7fe84cfa3675]
/usr/lib64/glusterfs/3.1.1/xlator/mgmt/glusterd.so(glusterd3_1_probe_cbk+0x495)[0x7fe84cfbe1f5]
/usr/lib64/libgfrpc.so.0(rpc_clnt_handle_reply+0xa4)[0x7fe84f05fa94]
/usr/lib64/libgfrpc.so.0(rpc_clnt_notify+0xc8)[0x7fe84f05fcd8]
/usr/lib64/libgfrpc.so.0(rpc_transport_notify+0x2e)[0x7fe84f05af2e]
/usr/lib64/glusterfs/3.1.1/rpc-transport/socket.so(socket_event_poll_in+0x3f)[0x7fe84cd74f9f]
/usr/lib64/glusterfs/3.1.1/rpc-transport/socket.so(socket_event_handler+0x114)[0x7fe84cd750d4]
/usr/lib64/libglusterfs.so.0(+0x3a384)[0x7fe84f2a1384]
/usr/sbin/glusterd(main+0x23c)[0x4055dc]
/lib64/libc.so.6(__libc_start_main+0xe6)[0x7fe84e6dabc6]
/usr/sbin/glusterd[0x4032c9]
---------

Starting glusterd on virt-zabbix-03 again makes glusterd on virt-zabbix-02 die, and so on. So I make sure the daemon is stopped on both hosts. The peer files generated on the two nodes differ: one is named after the hostname, the other after the IP:

virt-zabbix-02:# cat /etc/glusterd/peers/virt-zabbix-03
uuid=
state=0
hostname1=virt-zabbix-03

virt-zabbix-03:# cat /etc/glusterd/peers/192.168.8.104
uuid=
state=0
hostname1=192.168.8.104

The uuid is empty in both files, so I fill each one in with the UUID from the other node's "/etc/glusterd/glusterd.info" file:

virt-zabbix-02:/ # cat /etc/glusterd/glusterd.info
UUID=a9b660c5-456d-4e96-9bdd-d23c917ae941

virt-zabbix-03:/ # cat /etc/glusterd/glusterd.info
UUID=255540da-4b86-46f2-963c-3214e2c5e28a

virt-zabbix-02:/ # cat /etc/glusterd/peers/virt-zabbix-03
uuid=255540da-4b86-46f2-963c-3214e2c5e28a
state=0
hostname1=virt-zabbix-03

virt-zabbix-03:/ # cat /etc/glusterd/peers/192.168.8.104
uuid=a9b660c5-456d-4e96-9bdd-d23c917ae941
state=0
hostname1=192.168.8.104

Now I start glusterd on both nodes again. Both daemons keep running, and I can run the status command:

virt-zabbix-02:/ # gluster peer status
Number of Peers: 1

Hostname: virt-zabbix-03
Uuid: 255540da-4b86-46f2-963c-3214e2c5e28a
State: Establishing Connection (Connected)

I'd like to create my first test volume:

gluster volume create mytest transport tcp virt-zabbix-02:/gfs1 virt-zabbix-03:/gfs1
Creation of volume mytest has been unsuccessful
Host virt-zabbix-03 not connected

log virt-zabbix-02:
[2011-01-13 20:11:10.706931] I [glusterd-handler.c:674:glusterd_handle_cli_list_friends] glusterd: Received cli list req
[2011-01-13 20:12:20.950199] I [glusterd-handler.c:785:glusterd_handle_create_volume] glusterd: Received create volume req
[2011-01-13 20:12:20.950907] I [glusterd-utils.c:2101:glusterd_friend_find_by_hostname] glusterd: Friend virt-zabbix-03 found.. state: 0
[2011-01-13 20:12:20.950935] I [glusterd-utils.c:2062:glusterd_friend_find_by_uuid] glusterd: Friend found.. state: Establishing Connection
[2011-01-13 20:12:20.950950] E [glusterd-utils.c:2324:glusterd_new_brick_validate] glusterd: Host virt-zabbix-03 not connected
[2011-01-13 20:12:20.951005] E [glusterd-handler.c:906:glusterd_handle_create_volume] glusterd: Unlock on opinfo failed

There are no log entries on virt-zabbix-03.

Not connected? Strange! Status info again:

virt-zabbix-02:/ # gluster peer status
Number of Peers: 1

Hostname: virt-zabbix-03
Uuid: 255540da-4b86-46f2-963c-3214e2c5e28a
State: Establishing Connection (Connected)

log virt-zabbix-02:
[2011-01-13 20:13:24.601901] I [glusterd-handler.c:674:glusterd_handle_cli_list_friends] glusterd: Received cli list req

So I restart glusterd on virt-zabbix-03, and the daemon on virt-zabbix-02 dies again.

Does anyone have any idea what's going wrong?

Kind regards
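
P.S. In case it helps anyone reproduce the manual peer-file edit, here is a minimal Python sketch of what I did by hand. It assumes the peer file and glusterd.info are plain key=value text, as in the cat output in this post. The function names parse_kv and fill_peer_uuid are made up for illustration and are not part of glusterfs. The sketch only transforms text and does not touch /etc/glusterd. It reproduces the workaround only, not a fix for the crash.

```python
def parse_kv(text):
    """Parse simple key=value lines into a dict (insertion order is kept)."""
    result = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or "=" not in line:
            continue  # skip blank or malformed lines
        key, _, value = line.partition("=")
        result[key] = value
    return result


def fill_peer_uuid(peer_text, remote_info_text):
    """Return the peer-file text with an empty uuid filled in.

    peer_text:        contents of a file under /etc/glusterd/peers/
    remote_info_text: contents of the OTHER node's glusterd.info
    """
    peer = parse_kv(peer_text)
    remote_uuid = parse_kv(remote_info_text).get("UUID", "")
    if not remote_uuid:
        raise ValueError("remote glusterd.info has no UUID")
    # Only fill the field when it is missing or empty; never overwrite.
    if not peer.get("uuid"):
        peer["uuid"] = remote_uuid
    return "".join(f"{key}={value}\n" for key, value in peer.items())


if __name__ == "__main__":
    # Values copied from the files shown in this post.
    peer_file = "uuid=\nstate=0\nhostname1=virt-zabbix-03\n"
    remote_info = "UUID=255540da-4b86-46f2-963c-3214e2c5e28a\n"
    print(fill_peer_uuid(peer_file, remote_info), end="")
```

The daemon should be stopped on both hosts before the real files are changed, as I did above.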