Cannot create a simple 2-brick volume.

Gluster Community,

I'm having a terrible time just trying to get started with gluster.
I'm running CentOS 5.7 on a few nodes, and have installed gluster 3.2.4 and
its prerequisites from RPMs.

Yet I'm finding it impossible to create a simple 2-brick distributed volume.
I keep seeing this error:

    reading from socket failed. Error (Transport endpoint is not connected)

It refers to both localhost and the peers.  iptables is not running
on any of these machines, all machines can ssh to each other,
and each reports that its peers are connected.

I've googled this and other errors I've seen, and many results point
to this site, but none of the suggestions I've read have helped me.
The glusterfsd processes are running.  The peers are connected.  I've done
multiple reboots and restarts of the daemons.  This is a fresh install.

Details are listed below. 

Can someone please help me out? 

Thanks!
-Mark Sullivan
 Diviner Lunar Radiometer Experiment

==========================================================================================================

On gluster03, I'm creating a volume "glue" made up of
gluster03:/g1 and gluster04:/g1:

gluster volume create glue transport tcp gluster03:/g1 gluster04:/g1
gluster volume set glue auth.allow 10.*
gluster volume start glue

The "etc*" log files show this:

[2011-11-13 16:10:22.429786] I 
[glusterd-handler.c:900:glusterd_handle_create_volume] 0-glusterd: 
Received create volume req
[2011-11-13 16:10:22.430303] I [glusterd-utils.c:243:glusterd_lock] 
0-glusterd: Cluster lock held by fb1f46cf-a03a-4fcd-b103-735040af3ced
[2011-11-13 16:10:22.430330] I 
[glusterd-handler.c:420:glusterd_op_txn_begin] 0-glusterd: Acquired 
local lock
[2011-11-13 16:10:22.430777] I 
[glusterd-rpc-ops.c:752:glusterd3_1_cluster_lock_cbk] 0-glusterd: 
Received ACC from uuid: 7c9ee90c-91a5-45c0-aaf9-8b8a7347b67d
[2011-11-13 16:10:22.431182] I 
[glusterd-op-sm.c:6543:glusterd_op_ac_send_stage_op] 0-glusterd: Sent op 
req to 1 peers
[2011-11-13 16:10:22.431814] I 
[glusterd-rpc-ops.c:1050:glusterd3_1_stage_op_cbk] 0-glusterd: Received 
ACC from uuid: 7c9ee90c-91a5-45c0-aaf9-8b8a7347b67d
[2011-11-13 16:10:22.470773] I 
[glusterd-op-sm.c:6660:glusterd_op_ac_send_commit_op] 0-glusterd: Sent 
op req to 1 peers
[2011-11-13 16:10:22.489143] I 
[glusterd-rpc-ops.c:1236:glusterd3_1_commit_op_cbk] 0-glusterd: Received 
ACC from uuid: 7c9ee90c-91a5-45c0-aaf9-8b8a7347b67d
[2011-11-13 16:10:22.489566] I 
[glusterd-rpc-ops.c:811:glusterd3_1_cluster_unlock_cbk] 0-glusterd: 
Received ACC from uuid: 7c9ee90c-91a5-45c0-aaf9-8b8a7347b67d
[2011-11-13 16:10:22.489604] I 
[glusterd-op-sm.c:7077:glusterd_op_txn_complete] 0-glusterd: Cleared 
local lock
[2011-11-13 16:10:22.492971] W 
[socket.c:1494:__socket_proto_state_machine] 0-socket.management: 
reading from socket failed. Error (Transport endpoint is not connected), 
peer (127.0.0.1:1023)
[2011-11-13 16:10:22.611682] I [glusterd-utils.c:243:glusterd_lock] 
0-glusterd: Cluster lock held by fb1f46cf-a03a-4fcd-b103-735040af3ced
[2011-11-13 16:10:22.611709] I 
[glusterd-handler.c:420:glusterd_op_txn_begin] 0-glusterd: Acquired 
local lock
[2011-11-13 16:10:22.612096] I 
[glusterd-rpc-ops.c:752:glusterd3_1_cluster_lock_cbk] 0-glusterd: 
Received ACC from uuid: 7c9ee90c-91a5-45c0-aaf9-8b8a7347b67d
[2011-11-13 16:10:22.896543] I 
[glusterd-op-sm.c:6543:glusterd_op_ac_send_stage_op] 0-glusterd: Sent op 
req to 1 peers
[2011-11-13 16:10:23.55185] I 
[glusterd-rpc-ops.c:1050:glusterd3_1_stage_op_cbk] 0-glusterd: Received 
ACC from uuid: 7c9ee90c-91a5-45c0-aaf9-8b8a7347b67d
[2011-11-13 16:10:23.64798] I 
[glusterd-op-sm.c:6660:glusterd_op_ac_send_commit_op] 0-glusterd: Sent 
op req to 1 peers
[2011-11-13 16:10:23.74209] I 
[glusterd-rpc-ops.c:1236:glusterd3_1_commit_op_cbk] 0-glusterd: Received 
ACC from uuid: 7c9ee90c-91a5-45c0-aaf9-8b8a7347b67d
[2011-11-13 16:10:23.74527] I 
[glusterd-rpc-ops.c:811:glusterd3_1_cluster_unlock_cbk] 0-glusterd: 
Received ACC from uuid: 7c9ee90c-91a5-45c0-aaf9-8b8a7347b67d
[2011-11-13 16:10:23.74558] I 
[glusterd-op-sm.c:7077:glusterd_op_txn_complete] 0-glusterd: Cleared 
local lock
[2011-11-13 16:10:23.79190] W 
[socket.c:1494:__socket_proto_state_machine] 0-socket.management: 
reading from socket failed. Error (Transport endpoint is not connected), 
peer (127.0.0.1:1020)
[2011-11-13 16:10:23.198846] I 
[glusterd-handler.c:1078:glusterd_handle_cli_start_volume] 0-glusterd: 
Received start vol reqfor volume glue
[2011-11-13 16:10:23.198913] I [glusterd-utils.c:243:glusterd_lock] 
0-glusterd: Cluster lock held by fb1f46cf-a03a-4fcd-b103-735040af3ced
[2011-11-13 16:10:23.198938] I 
[glusterd-handler.c:420:glusterd_op_txn_begin] 0-glusterd: Acquired 
local lock
[2011-11-13 16:10:23.199364] I 
[glusterd-rpc-ops.c:752:glusterd3_1_cluster_lock_cbk] 0-glusterd: 
Received ACC from uuid: 7c9ee90c-91a5-45c0-aaf9-8b8a7347b67d
[2011-11-13 16:10:23.199819] I 
[glusterd-op-sm.c:6543:glusterd_op_ac_send_stage_op] 0-glusterd: Sent op 
req to 1 peers
[2011-11-13 16:10:23.200396] I 
[glusterd-rpc-ops.c:1050:glusterd3_1_stage_op_cbk] 0-glusterd: Received 
ACC from uuid: 7c9ee90c-91a5-45c0-aaf9-8b8a7347b67d
[2011-11-13 16:10:23.724138] I 
[glusterd-utils.c:1095:glusterd_volume_start_glusterfs] 0-: About to 
start glusterfs for brick gluster03:/g1
[2011-11-13 16:10:23.989454] I 
[glusterd-op-sm.c:6660:glusterd_op_ac_send_commit_op] 0-glusterd: Sent 
op req to 1 peers
[2011-11-13 16:10:24.7044] I [glusterd-pmap.c:237:pmap_registry_bind] 
0-pmap: adding brick /g1 on port 24009
[2011-11-13 16:10:24.39658] W 
[socket.c:1494:__socket_proto_state_machine] 0-socket.management: 
reading from socket failed. Error (Transport endpoint is not connected), 
peer (127.0.0.1:1017)
[2011-11-13 16:10:24.816411] I 
[glusterd-rpc-ops.c:1236:glusterd3_1_commit_op_cbk] 0-glusterd: Received 
ACC from uuid: 7c9ee90c-91a5-45c0-aaf9-8b8a7347b67d
[2011-11-13 16:10:24.816940] I 
[glusterd-rpc-ops.c:811:glusterd3_1_cluster_unlock_cbk] 0-glusterd: 
Received ACC from uuid: 7c9ee90c-91a5-45c0-aaf9-8b8a7347b67d
[2011-11-13 16:10:24.816993] I 
[glusterd-op-sm.c:7077:glusterd_op_txn_complete] 0-glusterd: Cleared 
local lock
[2011-11-13 16:10:24.818726] W 
[socket.c:1494:__socket_proto_state_machine] 0-socket.management: 
reading from socket failed. Error (Transport endpoint is not connected), 
peer (127.0.0.1:1019)
[2011-11-13 16:10:24.859565] W 
[socket.c:1494:__socket_proto_state_machine] 0-socket.management: 
reading from socket failed. Error (Transport endpoint is not connected), 
peer (10.1.1.24:1019)
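
Since the log above is wrapped across several lines per entry, the warnings are easy to lose among the informational messages. The following is a minimal sketch for pulling them out, assuming every entry starts with a "[YYYY-" timestamp as it does above. The function name filter_warnings is made up for illustration and is not part of gluster.

```shell
# Hypothetical helper: rejoin wrapped log entries and print only those
# whose severity letter is W (warning) or E (error).
# Assumption: a new entry always begins with a "[YYYY-" timestamp.
filter_warnings() {
    awk '
        # Print the entry collected so far if its severity is W or E.
        function emit() {
            if (entry ~ /^\[[0-9:. -]+\] [WE] /) print entry
        }
        # Drop trailing whitespace left behind by the line wrapping.
        { sub(/[ \t]+$/, "") }
        # A timestamp starts a new entry; finish the previous one first.
        /^\[[0-9][0-9][0-9][0-9]-/ { emit(); entry = $0; next }
        # Any other line continues the current entry.
        { entry = entry " " $0 }
        END { emit() }
    ' "$@"
}

# Usage (the path is a placeholder, not a confirmed location):
#   filter_warnings /path/to/glusterd.log
```

Run against the excerpt above, this would leave only the "reading from socket failed" warnings, one per line, each with its peer address.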

==========================================================================================================

My volume info looks okay, I guess...

gluster volume info

Volume Name: glue
Type: Distribute
Status: Started
Number of Bricks: 2
Transport-type: tcp
Bricks:
Brick1: gluster03:/g1
Brick2: gluster04:/g1
Options Reconfigured:
auth.allow: 10.*

When I mount the volume "glue" on gluster03 using "mount -t nfs 
gluster03:/glue /mnt", the nfs.log shows:

[2011-11-13 16:18:06.83447] I 
[client3_1-fops.c:2228:client3_1_lookup_cbk] 0-glue-client-0: remote 
operation failed: Invalid argument
[2011-11-13 16:18:06.83507] I [dht-common.c:478:dht_revalidate_cbk] 
0-glue-dht: subvolume glue-client-0 for / returned -1 (Invalid argument)
[2011-11-13 16:18:06.84676] I 
[client3_1-fops.c:2228:client3_1_lookup_cbk] 0-glue-client-0: remote 
operation failed: Invalid argument
[2011-11-13 16:18:06.84704] I [dht-common.c:478:dht_revalidate_cbk] 
0-glue-dht: subvolume glue-client-0 for / returned -1 (Invalid argument)
[2011-11-13 16:18:06.85687] W [rpc-common.c:64:xdr_to_generic] 
(-->/opt/glusterfs/3.2.4/lib64/libgfrpc.so.0(rpc_clnt_notify+0x8d) 
[0x2ae52ccad6fd] 
(-->/opt/glusterfs/3.2.4/lib64/libgfrpc.so.0(rpc_clnt_handle_reply+0xa2) 
[0x2ae52ccad502] 
(-->/opt/glusterfs/3.2.4/lib64/glusterfs/3.2.4/xlator/protocol/client.so(client3_1_stat_cbk+0x91) 
[0x2aaaaaacccb1]))) 0-xdr: XDR decoding failed
[2011-11-13 16:18:06.85723] E [client3_1-fops.c:398:client3_1_stat_cbk] 
0-glue-client-0: error
[2011-11-13 16:18:06.85748] I [client3_1-fops.c:411:client3_1_stat_cbk] 
0-glue-client-0: remote operation failed: Invalid argument
[2011-11-13 16:18:06.86273] W [rpc-common.c:64:xdr_to_generic] 
(-->/opt/glusterfs/3.2.4/lib64/libgfrpc.so.0(rpc_clnt_notify+0x8d) 
[0x2ae52ccad6fd] 
(-->/opt/glusterfs/3.2.4/lib64/libgfrpc.so.0(rpc_clnt_handle_reply+0xa2) 
[0x2ae52ccad502] 
(-->/opt/glusterfs/3.2.4/lib64/glusterfs/3.2.4/xlator/protocol/client.so(client3_1_stat_cbk+0x91) 
[0x2aaaaaacccb1]))) 0-xdr: XDR decoding failed
[2011-11-13 16:18:06.86301] E [client3_1-fops.c:398:client3_1_stat_cbk] 
0-glue-client-0: error
[2011-11-13 16:18:06.86324] I [client3_1-fops.c:411:client3_1_stat_cbk] 
0-glue-client-0: remote operation failed: Invalid argument

==========================================================================================================

When I do "touch /mnt/new", I get "No such file or directory", and 
nfs.log shows:

[2011-11-13 16:19:48.424842] I [dht-layout.c:192:dht_layout_search] 
0-glue-dht: no subvolume for hash (value) = 1407928635
[2011-11-13 16:19:48.425129] I 
[client3_1-fops.c:2228:client3_1_lookup_cbk] 0-glue-client-0: remote 
operation failed: Invalid argument
[2011-11-13 16:19:48.425751] I [dht-layout.c:192:dht_layout_search] 
0-glue-dht: no subvolume for hash (value) = 1407928635
[2011-11-13 16:19:48.425991] I 
[client3_1-fops.c:2228:client3_1_lookup_cbk] 0-glue-client-0: remote 
operation failed: Invalid argument
[2011-11-13 16:19:48.449516] I [dht-layout.c:192:dht_layout_search] 
0-glue-dht: no subvolume for hash (value) = 1407928635
[2011-11-13 16:19:48.449662] E [fd.c:465:fd_unref] 
(-->/opt/glusterfs/3.2.4/lib64/libglusterfs.so.0(default_create_cbk+0xb4) 
[0x2ae52ca65cc4] 
(-->/opt/glusterfs/3.2.4/lib64/glusterfs/3.2.4/xlator/debug/io-stats.so(io_stats_create_cbk+0x20c) 
[0x2aaaab76263c] 
(-->/opt/glusterfs/3.2.4/lib64/glusterfs/3.2.4/xlator/nfs/server.so(nfs_fop_create_cbk+0x73) 
[0x2aaaab988a13]))) 0-fd: fd is NULL
[2011-11-13 16:19:48.449859] W [rpc-common.c:64:xdr_to_generic] 
(-->/opt/glusterfs/3.2.4/lib64/libgfrpc.so.0(rpc_clnt_notify+0x8d) 
[0x2ae52ccad6fd] 
(-->/opt/glusterfs/3.2.4/lib64/libgfrpc.so.0(rpc_clnt_handle_reply+0xa2) 
[0x2ae52ccad502] 
(-->/opt/glusterfs/3.2.4/lib64/glusterfs/3.2.4/xlator/protocol/client.so(client3_1_statfs_cbk+0x7e) 
[0x2aaaaaac806e]))) 0-xdr: XDR decoding failed
[2011-11-13 16:19:48.449888] E 
[client3_1-fops.c:624:client3_1_statfs_cbk] 0-glue-client-0: error
[2011-11-13 16:19:48.449912] I 
[client3_1-fops.c:637:client3_1_statfs_cbk] 0-glue-client-0: remote 
operation failed: Invalid argument
[2011-11-13 16:19:48.450030] I [dht-layout.c:192:dht_layout_search] 
0-glue-dht: no subvolume for hash (value) = 1407928635
[2011-11-13 16:19:48.450260] I 
[client3_1-fops.c:2228:client3_1_lookup_cbk] 0-glue-client-0: remote 
operation failed: Invalid argument

==========================================================================================================

And from the brick log g1.log, in case this helps:

[2011-11-13 21:46:05.929654] I [glusterfsd.c:1493:main] 
0-/opt/glusterfs/3.2.4/sbin/glusterfsd: Started Running 
/opt/glusterfs/3.2.4/sbin/glusterfsd version 3.2.4
[2011-11-13 21:46:05.946509] W [socket.c:419:__socket_keepalive] 
0-socket: failed to set keep idle on socket 8
[2011-11-13 21:46:05.946618] W 
[socket.c:1846:socket_server_event_handler] 0-socket.glusterfsd: Failed 
to set keep-alive: Operation not supported
[2011-11-13 21:46:06.72770] W [graph.c:291:gf_add_cmdline_options] 
0-glue-server: adding option 'listen-port' for volume 'glue-server' with 
value '24010'
[2011-11-13 21:46:06.73873] W 
[rpc-transport.c:447:validate_volume_options] 0-tcp.glue-server: option 
'listen-port' is deprecated, preferred is 
'transport.socket.listen-port', continuing with correction
[2011-11-13 21:46:06.74204] W [posix.c:4686:init] 0-glue-posix: Posix 
access control list is not supported.
Given volfile:
+------------------------------------------------------------------------------+
  1: volume glue-posix
  2:     type storage/posix
  3:     option directory /g1
  4: end-volume
  5:
  6: volume glue-access-control
  7:     type features/access-control
  8:     subvolumes glue-posix
  9: end-volume
 10:
 11: volume glue-locks
 12:     type features/locks
 13:     subvolumes glue-access-control
 14: end-volume
 15:
 16: volume glue-io-threads
 17:     type performance/io-threads
 18:     subvolumes glue-locks
 19: end-volume
 20:
 21: volume glue-marker
 22:     type features/marker
 23:     option volume-uuid 2b567c80-ab30-44b2-9b17-e67e6e679096
 24:     option timestamp-file /etc/glusterd/vols/glue/marker.tstamp
 25:     option xtime off
 26:     option quota off
 27:     subvolumes glue-io-threads
 28: end-volume
 29:
 30: volume /g1
 31:     type debug/io-stats
 32:     option latency-measurement off
 33:     option count-fop-hits off
 34:     subvolumes glue-marker
 35: end-volume
 36:
 37: volume glue-server
 38:     type protocol/server
 39:     option transport-type tcp
 40:     option auth.addr./g1.allow 10.*
 41:     subvolumes /g1
 42: end-volume

+------------------------------------------------------------------------------+
[2011-11-13 21:46:09.133670] E [authenticate.c:227:gf_authenticate] 
0-auth: no authentication module is interested in accepting 
remote-client (null)
[2011-11-13 21:46:09.133729] E [server-handshake.c:553:server_setvolume] 
0-glue-server: Cannot authenticate client from 127.0.0.1:1023 3.2.4
[2011-11-13 21:46:09.389447] I [server-handshake.c:542:server_setvolume] 
0-glue-server: accepted client from 10.1.1.24:1022 (version: 3.2.4)
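
One thing stands out in the brick log above: the client that was refused came from 127.0.0.1, the client from 10.1.1.24 was accepted, and the volfile carries "auth.addr./g1.allow 10.*". The sketch below only illustrates that pattern. It assumes the allow entry is matched like a shell-style wildcard, which is not confirmed here, and addr_allowed is a made-up helper, not a gluster command.

```shell
# Hypothetical helper: report whether ADDRESS matches the wildcard PATTERN.
# Assumption: auth.allow entries behave like shell-style globs.
addr_allowed() {
    pattern=$1
    address=$2
    case "$address" in
        $pattern) echo yes ;;   # unquoted, so it is used as a glob pattern
        *)        echo no ;;
    esac
}

addr_allowed '10.*' 10.1.1.24   # the address that was accepted; prints "yes"
addr_allowed '10.*' 127.0.0.1   # the address that was refused; prints "no"
```

If that assumption holds, a client connecting over loopback would never match "10.*", which would be consistent with the "Cannot authenticate client from 127.0.0.1" entry. Adding 127.0.0.1 to auth.allow, or leaving auth.allow unset for a test, might be worth trying.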






