Re: Hard Failover with Samba and Glusterfs

David Spisla <spisla80@xxxxxxxxx> · Mon, 6 May 2019 13:51:32 +0200

Hello,

I have filed a bug for this issue:
https://bugzilla.redhat.com/show_bug.cgi?id=1706842

Regards
David Spisla

On Wed, 1 May 2019 at 14:46, Amar Tumballi Suryanarayan <atumball@xxxxxxxxxx> wrote:

On Wed, Apr 17, 2019 at 1:33 PM David Spisla <spisla80@xxxxxxxxx> wrote:
Dear Gluster Community,

I have this setup: a 4-node GlusterFS v5.5 cluster, using Samba/CTDB v4.8 to access the volumes (each node has a VIP).

I was testing this failover scenario:

1. Start writing 940 GB of small files (64K-100K) from a Win10 client to node1.
2. During the write, perform a hard shutdown of node1 (where the client is connected via the VIP) by turning off the power.

My expectation is that the write stops and, after a while, the Win10 client offers me a Retry, so I can continue the write on a different node (which now holds the VIP of node1).
In the past I observed exactly this, but now the system shows strange behaviour:

The Win10 client does nothing and the Explorer freezes. In the backend, CTDB cannot perform the failover and throws errors. The glusterd on node2 and node3 logs these messages:

[2019-04-16 14:47:31.828323] W [glusterd-locks.c:795:glusterd_mgmt_v3_unlock] (-->/usr/lib64/glusterfs/5.5/xlator/mgmt/glusterd.so(+0x24349) [0x7f1a62fcb349] -->/usr/lib64/glusterfs/5.5/xlator/mgmt/glusterd.so(+0x2d950) [0x7f1a62fd4950] -->/usr/lib64/glusterfs/5.5/xlator/mgmt/glusterd.so(+0xe0359) [0x7f1a63087359] ) 0-management: Lock for vol archive1 not held
[2019-04-16 14:47:31.828350] W [MSGID: 106117] [glusterd-handler.c:6451:__glusterd_peer_rpc_notify] 0-management: Lock not released for archive1
[2019-04-16 14:47:31.828369] W [glusterd-locks.c:795:glusterd_mgmt_v3_unlock] (-->/usr/lib64/glusterfs/5.5/xlator/mgmt/glusterd.so(+0x24349) [0x7f1a62fcb349] -->/usr/lib64/glusterfs/5.5/xlator/mgmt/glusterd.so(+0x2d950) [0x7f1a62fd4950] -->/usr/lib64/glusterfs/5.5/xlator/mgmt/glusterd.so(+0xe0359) [0x7f1a63087359] ) 0-management: Lock for vol archive2 not held
[2019-04-16 14:47:31.828376] W [MSGID: 106117] [glusterd-handler.c:6451:__glusterd_peer_rpc_notify] 0-management: Lock not released for archive2
[2019-04-16 14:47:31.828412] W [glusterd-locks.c:795:glusterd_mgmt_v3_unlock] (-->/usr/lib64/glusterfs/5.5/xlator/mgmt/glusterd.so(+0x24349) [0x7f1a62fcb349] -->/usr/lib64/glusterfs/5.5/xlator/mgmt/glusterd.so(+0x2d950) [0x7f1a62fd4950] -->/usr/lib64/glusterfs/5.5/xlator/mgmt/glusterd.so(+0xe0359) [0x7f1a63087359] ) 0-management: Lock for vol gluster_shared_storage not held
[2019-04-16 14:47:31.828423] W [MSGID: 106117] [glusterd-handler.c:6451:__glusterd_peer_rpc_notify] 0-management: Lock not released for gluster_shared_storage
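
To see at a glance which volumes are affected, warnings like the ones above can be summarized with a short script. This is only a minimal sketch and not part of the original report: the function name volumes_with_unreleased_locks and the regular expression are assumptions derived solely from the MSGID 106117 lines quoted here, so other glusterd versions may format these messages differently.

```python
import re
import sys
from collections import Counter

# Pattern assumed from the "Lock not released" (MSGID 106117) warnings
# quoted in this thread; it is not taken from the glusterd sources.
LOCK_NOT_RELEASED = re.compile(
    r"^\[(?P<ts>[^\]]+)\] W \[MSGID: 106117\] "
    r".*Lock not released for (?P<vol>\S+)\s*$"
)


def volumes_with_unreleased_locks(lines):
    """Count 'Lock not released' warnings per volume name."""
    counts = Counter()
    for line in lines:
        match = LOCK_NOT_RELEASED.match(line)
        if match:
            counts[match.group("vol")] += 1
    return counts


if __name__ == "__main__":
    # Read glusterd log lines from stdin and print one summary line per volume.
    for vol, n in sorted(volumes_with_unreleased_locks(sys.stdin).items()):
        print(f"{vol}: {n}")
```

Piping the glusterd log of each surviving node into the script gives a per-volume count, which makes it easier to compare node2 and node3.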

In my opinion, Samba/CTDB cannot perform the failover correctly and continue the write because GlusterFS did not release the lock. What do you think? It looks like a bug to me, because the failover worked correctly in the past.

Thanks for the report, David. It surely looks like a bug, and I will let the experts in this area answer the question. One request for cases like this: please file a bug (preferred) or a GitHub issue, so that it is tracked in the system.

Regards
David Spisla

_______________________________________________

Gluster-users mailing list

Gluster-users@xxxxxxxxxxx

https://lists.gluster.org/mailman/listinfo/gluster-users

-- 
Amar Tumballi (amarts)
