Re: Issues with replicated gluster volume

Hi Karthik,

Please find the details below.

Please provide the following info:
1. gluster peer status

gluster peer status
Number of Peers: 2

Hostname: node1
Uuid: 0e679115-15ad-4a85-9d0a-9178471ef90
State: Peer in Cluster (Connected)

Hostname: node2
Uuid: 785a7c5b-86d3-45b9-b371-7e66e7fa88e0
State: Peer in Cluster (Connected)


gluster pool list
UUID                                    Hostname                                State
0e679115-15ad-4a85-9d0a-9178471ef90     node1                                   Connected
785a7c5b-86d3-45b9-b371-7e66e7fa88e0    node2                                   Connected
ec137af6-4845-4ebb-955a-fac1df9b7b6c    localhost(node3)                        Connected

2. gluster volume info glustervol

Volume Name: glustervol
Type: Replicate
Volume ID: 5422bb27-1863-47d5-b216-61751a01b759
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x 3 = 3
Transport-type: tcp
Bricks:
Brick1: node1:/data
Brick2: node2:/data
Brick3: node3:/data
Options Reconfigured:
performance.client-io-threads: off
nfs.disable: on
storage.fips-mode-rchecksum: on
transport.address-family: inet

3. gluster volume status glustervol

gluster volume status glustervol
Status of volume: glustervol
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick node1:/data                            49152     0          Y       59739
Brick node2:/data                            49153     0          Y       3498
Brick node3:/data                            49152     0          Y       1880
Self-heal Daemon on localhost                N/A       N/A        Y       1905
Self-heal Daemon on node1                    N/A       N/A        Y       3519
Self-heal Daemon on node2                    N/A       N/A        Y       59760

Task Status of Volume glustervol
------------------------------------------------------------------------------
There are no active volume tasks

4. client log from node4 when you saw unavailability

Below are the logs from when I rebooted server node3. The logs show "0-glustervol-client-2: disconnected from glustervol-client-2".

Please find the complete logs below, from the reboot until the server was available again. I am testing high availability by simply rebooting a server. In a real-world scenario a server may be unavailable for several hours, so we do not want a long downtime.


[2020-06-16 05:14:25.256136] I [MSGID: 114046] [client-handshake.c:1105:client_setvolume_cbk] 0-glustervol-client-0: Connected to glustervol-client-0, attached to remote volume '/data'.
[2020-06-16 05:14:25.256179] I [MSGID: 108005] [afr-common.c:5247:__afr_handle_child_up_event] 0-glustervol-replicate-0: Subvolume 'glustervol-client-0' came back up; going online.
[2020-06-16 05:14:25.257972] I [MSGID: 114046] [client-handshake.c:1105:client_setvolume_cbk] 0-glustervol-client-1: Connected to glustervol-client-1, attached to remote volume '/data'.
[2020-06-16 05:14:25.258014] I [MSGID: 108002] [afr-common.c:5609:afr_notify] 0-glustervol-replicate-0: Client-quorum is met
[2020-06-16 05:14:25.260312] I [MSGID: 114046] [client-handshake.c:1105:client_setvolume_cbk] 0-glustervol-client-2: Connected to glustervol-client-2, attached to remote volume '/data'.
[2020-06-16 05:14:25.261935] I [fuse-bridge.c:5145:fuse_init] 0-glusterfs-fuse: FUSE inited with protocol versions: glusterfs 7.24 kernel 7.23
[2020-06-16 05:14:25.261957] I [fuse-bridge.c:5756:fuse_graph_sync] 0-fuse: switched to graph 0
[2020-06-16 05:16:59.729400] I [MSGID: 114018] [client.c:2331:client_rpc_notify] 0-glustervol-client-2: disconnected from glustervol-client-2. Client process will keep trying to connect to glusterd until brick's port is available
[2020-06-16 05:16:59.730053] E [rpc-clnt.c:346:saved_frames_unwind] (--> /lib64/libglusterfs.so.0(_gf_log_callingfn+0x13b)[0x7f4a6a41badb] (--> /lib64/libgfrpc.so.0(+0xd7e4)[0x7f4a6a1c27e4] (--> /lib64/libgfrpc.so.0(+0xd8fe)[0x7f4a6a1c28fe] (--> /lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x97)[0x7f4a6a1c3987] (--> /lib64/libgfrpc.so.0(+0xf518)[0x7f4a6a1c4518] ))))) 0-glustervol-client-2: forced unwinding frame type(GlusterFS 4.x v1) op(LOOKUP(27)) called at 2020-06-16 05:16:08.175698 (xid=0xae)
[2020-06-16 05:16:59.730089] W [MSGID: 114031] [client-rpc-fops_v2.c:2634:client4_0_lookup_cbk] 0-glustervol-client-2: remote operation failed. Path: / (00000000-0000-0000-0000-000000000001) [Transport endpoint is not connected]
[2020-06-16 05:16:59.730336] E [rpc-clnt.c:346:saved_frames_unwind] (--> /lib64/libglusterfs.so.0(_gf_log_callingfn+0x13b)[0x7f4a6a41badb] (--> /lib64/libgfrpc.so.0(+0xd7e4)[0x7f4a6a1c27e4] (--> /lib64/libgfrpc.so.0(+0xd8fe)[0x7f4a6a1c28fe] (--> /lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x97)[0x7f4a6a1c3987] (--> /lib64/libgfrpc.so.0(+0xf518)[0x7f4a6a1c4518] ))))) 0-glustervol-client-2: forced unwinding frame type(GlusterFS 4.x v1) op(LOOKUP(27)) called at 2020-06-16 05:16:10.237849 (xid=0xaf)
[2020-06-16 05:16:59.730540] E [rpc-clnt.c:346:saved_frames_unwind] (--> /lib64/libglusterfs.so.0(_gf_log_callingfn+0x13b)[0x7f4a6a41badb] (--> /lib64/libgfrpc.so.0(+0xd7e4)[0x7f4a6a1c27e4] (--> /lib64/libgfrpc.so.0(+0xd8fe)[0x7f4a6a1c28fe] (--> /lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x97)[0x7f4a6a1c3987] (--> /lib64/libgfrpc.so.0(+0xf518)[0x7f4a6a1c4518] ))))) 0-glustervol-client-2: forced unwinding frame type(GlusterFS 4.x v1) op(LOOKUP(27)) called at 2020-06-16 05:16:22.694419 (xid=0xb0)
[2020-06-16 05:16:59.731132] E [rpc-clnt.c:346:saved_frames_unwind] (--> /lib64/libglusterfs.so.0(_gf_log_callingfn+0x13b)[0x7f4a6a41badb] (--> /lib64/libgfrpc.so.0(+0xd7e4)[0x7f4a6a1c27e4] (--> /lib64/libgfrpc.so.0(+0xd8fe)[0x7f4a6a1c28fe] (--> /lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x97)[0x7f4a6a1c3987] (--> /lib64/libgfrpc.so.0(+0xf518)[0x7f4a6a1c4518] ))))) 0-glustervol-client-2: forced unwinding frame type(GlusterFS 4.x v1) op(LOOKUP(27)) called at 2020-06-16 05:16:27.574139 (xid=0xb1)
[2020-06-16 05:16:59.731319] E [rpc-clnt.c:346:saved_frames_unwind] (--> /lib64/libglusterfs.so.0(_gf_log_callingfn+0x13b)[0x7f4a6a41badb] (--> /lib64/libgfrpc.so.0(+0xd7e4)[0x7f4a6a1c27e4] (--> /lib64/libgfrpc.so.0(+0xd8fe)[0x7f4a6a1c28fe] (--> /lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x97)[0x7f4a6a1c3987] (--> /lib64/libgfrpc.so.0(+0xf518)[0x7f4a6a1c4518] ))))) 0-glustervol-client-2: forced unwinding frame type(GF-DUMP) op(NULL(2)) called at 2020-06-16 05:16:34.231433 (xid=0xb2)
[2020-06-16 05:16:59.731352] W [rpc-clnt-ping.c:210:rpc_clnt_ping_cbk] 0-glustervol-client-2: socket disconnected
[2020-06-16 05:16:59.731464] E [rpc-clnt.c:346:saved_frames_unwind] (--> /lib64/libglusterfs.so.0(_gf_log_callingfn+0x13b)[0x7f4a6a41badb] (--> /lib64/libgfrpc.so.0(+0xd7e4)[0x7f4a6a1c27e4] (--> /lib64/libgfrpc.so.0(+0xd8fe)[0x7f4a6a1c28fe] (--> /lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x97)[0x7f4a6a1c3987] (--> /lib64/libgfrpc.so.0(+0xf518)[0x7f4a6a1c4518] ))))) 0-glustervol-client-2: forced unwinding frame type(GlusterFS 4.x v1) op(LOOKUP(27)) called at 2020-06-16 05:16:41.213884 (xid=0xb3)
[2020-06-16 05:16:59.731650] E [rpc-clnt.c:346:saved_frames_unwind] (--> /lib64/libglusterfs.so.0(_gf_log_callingfn+0x13b)[0x7f4a6a41badb] (--> /lib64/libgfrpc.so.0(+0xd7e4)[0x7f4a6a1c27e4] (--> /lib64/libgfrpc.so.0(+0xd8fe)[0x7f4a6a1c28fe] (--> /lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x97)[0x7f4a6a1c3987] (--> /lib64/libgfrpc.so.0(+0xf518)[0x7f4a6a1c4518] ))))) 0-glustervol-client-2: forced unwinding frame type(GlusterFS 4.x v1) op(LOOKUP(27)) called at 2020-06-16 05:16:48.756212 (xid=0xb4)
[2020-06-16 05:16:59.731876] E [rpc-clnt.c:346:saved_frames_unwind] (--> /lib64/libglusterfs.so.0(_gf_log_callingfn+0x13b)[0x7f4a6a41badb] (--> /lib64/libgfrpc.so.0(+0xd7e4)[0x7f4a6a1c27e4] (--> /lib64/libgfrpc.so.0(+0xd8fe)[0x7f4a6a1c28fe] (--> /lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x97)[0x7f4a6a1c3987] (--> /lib64/libgfrpc.so.0(+0xf518)[0x7f4a6a1c4518] ))))) 0-glustervol-client-2: forced unwinding frame type(GlusterFS 4.x v1) op(LOOKUP(27)) called at 2020-06-16 05:16:52.258940 (xid=0xb5)
[2020-06-16 05:16:59.732060] E [rpc-clnt.c:346:saved_frames_unwind] (--> /lib64/libglusterfs.so.0(_gf_log_callingfn+0x13b)[0x7f4a6a41badb] (--> /lib64/libgfrpc.so.0(+0xd7e4)[0x7f4a6a1c27e4] (--> /lib64/libgfrpc.so.0(+0xd8fe)[0x7f4a6a1c28fe] (--> /lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x97)[0x7f4a6a1c3987] (--> /lib64/libgfrpc.so.0(+0xf518)[0x7f4a6a1c4518] ))))) 0-glustervol-client-2: forced unwinding frame type(GlusterFS 4.x v1) op(LOOKUP(27)) called at 2020-06-16 05:16:54.618301 (xid=0xb6)
[2020-06-16 05:16:59.732246] E [rpc-clnt.c:346:saved_frames_unwind] (--> /lib64/libglusterfs.so.0(_gf_log_callingfn+0x13b)[0x7f4a6a41badb] (--> /lib64/libgfrpc.so.0(+0xd7e4)[0x7f4a6a1c27e4] (--> /lib64/libgfrpc.so.0(+0xd8fe)[0x7f4a6a1c28fe] (--> /lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x97)[0x7f4a6a1c3987] (--> /lib64/libgfrpc.so.0(+0xf518)[0x7f4a6a1c4518] ))))) 0-glustervol-client-2: forced unwinding frame type(GlusterFS 4.x v1) op(LOOKUP(27)) called at 2020-06-16 05:16:58.288790 (xid=0xb7)
[2020-06-16 05:17:10.245302] I [rpc-clnt.c:2028:rpc_clnt_reconfig] 0-glustervol-client-2: changing port to 49152 (from 0)
[2020-06-16 05:17:10.249896] I [MSGID: 114046] [client-handshake.c:1105:client_setvolume_cbk] 0-glustervol-client-2: Connected to glustervol-client-2, attached to remote volume '/data'.
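
To make the timing in the log above easier to read, here is a minimal sketch that pulls out three timestamps for one brick client: the oldest request that was stuck when the frames were unwound, the disconnect, and the reconnect. The function name outage_window and the shortened sample lines are my own for illustration (the sample keeps the real timestamps but trims the message text); the script only reads a log, it is not part of any gluster tool.

```python
import re
from datetime import datetime

# Timestamp prefix of a client log line, e.g. [2020-06-16 05:16:59.729400]
TS = re.compile(r"^\[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{6})\]")
# "called at" timestamp carried by the "forced unwinding frame" lines
CALLED_AT = re.compile(r"called at (\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{6})")
FMT = "%Y-%m-%d %H:%M:%S.%f"


def outage_window(lines, client="glustervol-client-2"):
    """Return (oldest_stuck_call, disconnect, reconnect) for one brick client.

    Hypothetical helper: any value that is not found in the log stays None.
    """
    oldest = disconnect = reconnect = None
    for line in lines:
        m = TS.match(line)
        if not m or client not in line:
            continue
        ts = datetime.strptime(m.group(1), FMT)
        if "disconnected from" in line and disconnect is None:
            disconnect = ts
        elif "forced unwinding frame" in line:
            c = CALLED_AT.search(line)
            if c:
                called = datetime.strptime(c.group(1), FMT)
                oldest = called if oldest is None else min(oldest, called)
        elif "Connected to" in line and disconnect is not None and reconnect is None:
            # Only a connect that follows the disconnect counts as the reconnect.
            reconnect = ts
    return oldest, disconnect, reconnect


# Shortened copies of four lines from the log above (timestamps unchanged).
sample = [
    "[2020-06-16 05:14:25.260312] I [MSGID: 114046] 0-glustervol-client-2: Connected to glustervol-client-2",
    "[2020-06-16 05:16:59.729400] I [MSGID: 114018] 0-glustervol-client-2: disconnected from glustervol-client-2.",
    "[2020-06-16 05:16:59.730053] E 0-glustervol-client-2: forced unwinding frame op(LOOKUP(27)) called at 2020-06-16 05:16:08.175698 (xid=0xae)",
    "[2020-06-16 05:17:10.249896] I [MSGID: 114046] 0-glustervol-client-2: Connected to glustervol-client-2",
]

oldest, disconnect, reconnect = outage_window(sample)
print("oldest stuck call -> disconnect:", (disconnect - oldest).total_seconds(), "s")
print("disconnect -> reconnect:", (reconnect - disconnect).total_seconds(), "s")
```

To run it against the real client log, pass the open log file instead of the sample list, since a file object also iterates line by line.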

Thanks,
Ahemad

On Tuesday, 16 June, 2020, 10:10:16 am IST, Karthik Subrahmanya <ksubrahm@xxxxxxxxxx> wrote:


Hi Ahemad,

Please provide the following info:
1. gluster peer status
2. gluster volume info glustervol
3. gluster volume status glustervol
4. client log from node4 when you saw unavailability

Regards,
Karthik

On Mon, Jun 15, 2020 at 11:07 PM ahemad shaik <ahemad_shaik@xxxxxxxxx> wrote:
Hi There,

I have created a replica 3 gluster volume with 3 bricks from 3 nodes.

"gluster volume create glustervol replica 3 transport tcp node1:/data node2:/data node3:/data force"

It is mounted on the client node using the command below.

"mount -t glusterfs node4:/glustervol    /mnt/"

When any one of the nodes (node1, node2 or node3) goes down, the gluster mount/volume (/mnt) is not accessible on the client (node4).

The purpose of a replicated volume is high availability, but I am not able to achieve it.

Is it a bug, or am I missing something?

Any suggestions would be a great help!

Thanks,
Ahemad  
 
________



Community Meeting Calendar:

Schedule -
Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC
Bridge: https://bluejeans.com/441850968

Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
https://lists.gluster.org/mailman/listinfo/gluster-users