My fault: I used localhost as the endpoint.
I re-issued "mount -t glusterfs server01:/speech0 qqq",
and the log shows many entries like the following:
[2015-08-31 12:08:44.801169] W [socket.c:923:__socket_keepalive] 0-socket: failed to set TCP_USER_TIMEOUT 0 on socket 57, Protocol not available
[2015-08-31 12:08:44.801187] E [socket.c:3019:socket_connect] 0-speech0-client-43: Failed to set keep-alive: Protocol not available
[2015-08-31 12:08:44.801305] W [socket.c:642:__socket_rwv] 0-speech0-client-43: readv on 10.88.153.25:24007 failed (Connection reset by peer)
[2015-08-31 12:08:44.801404] E [rpc-clnt.c:362:saved_frames_unwind] (--> /usr/lib64/libglusterfs.so.0(_gf_log_callingfn+0x1eb)[0x7fcf540db65b] (--> /usr/lib64/libgfrpc.so.0(saved_frames_unwind+0x1e7)[0x7fcf53ea71b7] (--> /usr/lib64/libgfrpc.so.0(saved_frames_destroy+0xe)[0x7fcf53ea72ce] (--> /usr/lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup+0xab)[0x7fcf53ea739b] (--> /usr/lib64/libgfrpc.so.0(rpc_clnt_notify+0x15f)[0x7fcf53ea795f] ))))) 0-speech0-client-43: forced unwinding frame type(GF-DUMP) op(DUMP(1)) called at 2015-08-31 12:08:44.801294 (xid=0x17)
[2015-08-31 12:08:44.801423] W [MSGID: 114032] [client-handshake.c:1623:client_dump_version_cbk] 0-speech0-client-43: received RPC status error [Transport endpoint is not connected]
[2015-08-31 12:08:44.801440] I [MSGID: 114018] [client.c:2042:client_rpc_notify] 0-speech0-client-43: disconnected from speech0-client-43. Client process will keep trying to connect to glusterd until brick's port is available
[2015-08-31 12:08:44.804488] W [socket.c:923:__socket_keepalive] 0-socket: failed to set TCP_USER_TIMEOUT 0 on socket 57, Protocol not available
[2015-08-31 12:08:44.804505] E [socket.c:3019:socket_connect] 0-speech0-client-51: Failed to set keep-alive: Protocol not available
[2015-08-31 12:08:44.804775] W [socket.c:642:__socket_rwv] 0-speech0-client-51: readv on 10.88.146.19:24007 failed (Connection reset by peer)
[2015-08-31 12:08:44.804878] E [rpc-clnt.c:362:saved_frames_unwind] (--> /usr/lib64/libglusterfs.so.0(_gf_log_callingfn+0x1eb)[0x7fcf540db65b] (--> /usr/lib64/libgfrpc.so.0(saved_frames_unwind+0x1e7)[0x7fcf53ea71b7] (--> /usr/lib64/libgfrpc.so.0(saved_frames_destroy+0xe)[0x7fcf53ea72ce] (--> /usr/lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup+0xab)[0x7fcf53ea739b] (--> /usr/lib64/libgfrpc.so.0(rpc_clnt_notify+0x15f)[0x7fcf53ea795f] ))))) 0-speech0-client-51: forced unwinding frame type(GF-DUMP) op(DUMP(1)) called at 2015-08-31 12:08:44.804693 (xid=0x18)
[2015-08-31 12:08:44.804898] W [MSGID: 114032] [client-handshake.c:1623:client_dump_version_cbk] 0-speech0-client-51: received RPC status error [Transport endpoint is not connected]
[2015-08-31 12:08:44.804917] I [MSGID: 114018] [client.c:2042:client_rpc_notify] 0-speech0-client-51: disconnected from speech0-client-51. Client process will keep trying to connect to glusterd until brick's port is available
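
Should I be checking something like the following from the client side? (The volume name and the peer address are taken from the log above; the rest is just my guess.)

# can this client reach glusterd's management port on that peer at all?
telnet 10.88.153.25 24007

# are the bricks of the volume reported online, and are all peers still connected?
gluster volume status speech0
gluster peer status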
2015-08-31 20:06 GMT+08:00 Yiping Peng <barius.cn@xxxxxxxxx>:
I believe the following events have happened in the cluster resulting
into this situation:
1. GlusterD & brick process on node 2 was brought down
2. Node 1 was rebooted.

Strangely enough, glusterfs, glusterd and glusterfsd are running on my server. Is glusterfsd the brick process? Also server01 has not been rebooted during the whole process.

glusterfsd has the following arguments:

/usr/sbin/glusterfsd -s server01.local.net --volfile-id speech0.server01.local.net.home-glusterfs-speech0-brick0 -p /var/lib/glusterd/vols/speech0/run/server01.local.net-home-glusterfs-speech0-brick0.pid -S /var/run/gluster/6bf40a98deade9dde8b615226bc57567.socket --brick-name /home/glusterfs/speech0/brick0 -l /var/log/glusterfs/bricks/home-glusterfs-speech0-brick0.log --xlator-option *-posix.glusterd-uuid=1c33ff18-2a6a-44cf-9a04-727fc96e92be --brick-port 49159 --xlator-option speech0-server.listen-port=49159

One more thing, when I do this on server1, which has been in the pool for a long time:

server1:~$ mount server1:/vol1 mountpoint

It also fails. The log gave me:

[2015-08-31 11:56:57.123307] I [MSGID: 100030] [glusterfsd.c:2301:main] 0-/usr/sbin/glusterfs: Started running /usr/sbin/glusterfs version 3.7.3 (args: /usr/sbin/glusterfs --volfile-server=localhost --volfile-id=/speech0 qqq)
[2015-08-31 11:56:57.134642] W [socket.c:923:__socket_keepalive] 0-socket: failed to set TCP_USER_TIMEOUT 0 on socket 9, Protocol not available
[2015-08-31 11:56:57.134688] E [socket.c:3019:socket_connect] 0-glusterfs: Failed to set keep-alive: Protocol not available
[2015-08-31 11:56:57.135063] I [MSGID: 101190] [event-epoll.c:632:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1
[2015-08-31 11:56:57.135113] E [socket.c:2332:socket_connect_finish] 0-glusterfs: connection to 127.0.0.1:24007 failed (Connection reset by peer)
[2015-08-31 11:56:57.135149] E [glusterfsd-mgmt.c:1819:mgmt_rpc_notify] 0-glusterfsd-mgmt: failed to connect with remote-host: localhost (Transport endpoint is not connected)
[2015-08-31 11:56:57.135158] I [glusterfsd-mgmt.c:1825:mgmt_rpc_notify] 0-glusterfsd-mgmt: Exhausted all volfile servers
[2015-08-31 11:56:57.135333] W [glusterfsd.c:1219:cleanup_and_exit] (-->/usr/lib64/libgfrpc.so.0(rpc_clnt_notify+0x1a3) [0x7fb5e1be39a3] -->/usr/sbin/glusterfs() [0x4099c8] -->/usr/sbin/glusterfs(cleanup_and_exit+0x65) [0x4059b5] ) 0-: received signum (1), shutting down
[2015-08-31 11:56:57.135371] I [fuse-bridge.c:5595:fini] 0-fuse: Unmounting '/home/speech/pengyiping/qqq'.
[2015-08-31 11:56:57.140640] W [glusterfsd.c:1219:cleanup_and_exit] (-->/lib64/libpthread.so.0() [0x318b207851] -->/usr/sbin/glusterfs(glusterfs_sigwaiter+0xcd) [0x405e4d] -->/usr/sbin/glusterfs(cleanup_and_exit+0x65) [0x4059b5] ) 0-: received signum (15), shutting down

Any help is much appreciated.

2015-08-31 19:15 GMT+08:00 Atin Mukherjee <amukherj@xxxxxxxxxx>:

I believe the following events have happened in the cluster resulting
into this situation:
1. GlusterD & brick process on node 2 was brought down
2. Node 1 was rebooted.
In the above case the mount will definitely fail, since the brick process
was not started: in a two-node setup, glusterd waits for its peers to come
up before it starts the bricks. Could you check whether the brick
process is running or not?
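Something along these lines should tell you (I am guessing the volume name
from your mount command, so substitute the real one):

# are the bricks of the volume reported as online, and on which ports?
gluster volume status vol1
# is the brick process itself alive on this node?
ps aux | grep glusterfsd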
Thanks,
Atin
On 08/31/2015 04:17 PM, Yiping Peng wrote:
> I've tried both (assuming server1 is already in the pool and server2 is
> still undergoing peer-probing):
>
> server2:~$ mount server1:/vol1 mountpoint, fail;
> server2:~$ mount server2:/vol1 mountpoint, fail.
>
> Strangely enough, I *should* be able to mount server1:/vol1 on server2, but
> this is not the case :(
> Maybe something is broken in the server pool, as I'm seeing disconnected
> nodes?
>
>
> 2015-08-31 18:02 GMT+08:00 Ravishankar N <ravishankar@xxxxxxxxxx>:
>
>>
>>
>> On 08/31/2015 12:53 PM, Merlin Morgenstern wrote:
>>
>> Trying to mount the brick on the same physical server, with the daemon running
>> on this server but not on the other server:
>>
>> @node2:~$ sudo mount -t glusterfs gs2:/volume1 /data/nfs
>> Mount failed. Please check the log file for more details.
>>
>> For the mount to succeed, glusterd must be up on the node that you specify
>> as the volfile-server; gs2 in this case. You can use -o
>> backupvolfile-server=gs1 as a fallback.
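>> For example, something like this (reusing the names from your command
>> above, and assuming glusterd is up on gs1):
>>
>> mount -t glusterfs -o backupvolfile-server=gs1 gs2:/volume1 /data/nfs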
>> -Ravi
>>
>>
>
>
>
>
_______________________________________________
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
http://www.gluster.org/mailman/listinfo/gluster-users