On Wed, Apr 11, 2018 at 4:35 AM, TomK <tomkcpr@xxxxxxxxxxx> wrote:
On 4/9/2018 2:45 AM, Alex K wrote:
Hey Alex,
With two nodes, the setup works, but both sides go down when one node is missing. Still, I set the below two params to none and that solved my issue:
cluster.quorum-type: none
cluster.server-quorum-type: none
Yes, this disables quorum so as to avoid the issue. Glad that this helped. Bear in mind, though, that it is easier to run into split-brain issues when quorum is disabled; that's why at least 3 nodes are recommended. Just to note that I also have a 2-node cluster which has been running without issues for a long time.
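For reference, the two options are applied per volume with 'gluster volume set'; a minimal sketch, using the gv01 volume name from the output further down:

  gluster volume set gv01 cluster.quorum-type none
  gluster volume set gv01 cluster.server-quorum-type none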
Thank you for that.
Cheers,
Tom
Hi,
You need at least 3 nodes to have quorum enabled. In a 2-node setup you need to disable quorum to be able to keep using the volume when one of the nodes goes down.
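If a third box ever becomes available, the usual alternative to disabling quorum is to grow the volume to replica 3 (a small arbiter brick is enough) and leave quorum on. A rough sketch only; nfs03 and its brick path are placeholders, and the exact add-brick syntax is worth double-checking against your gluster release:

  gluster peer probe nfs03
  gluster volume add-brick gv01 replica 3 arbiter 1 nfs03:/bricks/0/gv01
  gluster volume set gv01 cluster.quorum-type auto
  gluster volume set gv01 cluster.server-quorum-type server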
On Mon, Apr 9, 2018, 09:02 TomK <tomkcpr@xxxxxxxxxxx> wrote:
Hey All,
In a two-node glusterfs setup, with one node down, I can't use the second
node to mount the volume. I understand this is expected behaviour?
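For what it's worth, the fuse mount itself does not have to depend on nfs01 being reachable; a backup volfile server can be listed at mount time. A sketch only (the option spelling may differ between versions), and client quorum would still block I/O with only one brick up:

  mount -t glusterfs -o backup-volfile-servers=nfs02 nfs01:/gv01 /n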
Is there any way to allow the secondary node to keep functioning and then
replicate what changed to the first (primary) node when it comes back
online? Or should I just go for a third node to allow for this?
Also, how safe is it to set the following to none?
cluster.quorum-type: auto
cluster.server-quorum-type: server
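For reference, the currently effective values can be read back with 'gluster volume get' before changing anything; a small sketch against the gv01 volume:

  gluster volume get gv01 cluster.quorum-type
  gluster volume get gv01 cluster.server-quorum-type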
[root@nfs01 /]# gluster volume start gv01
volume start: gv01: failed: Quorum not met. Volume operation not
allowed.
[root@nfs01 /]#
[root@nfs01 /]# gluster volume status
Status of volume: gv01
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick nfs01:/bricks/0/gv01                  N/A       N/A        N       N/A
Self-heal Daemon on localhost               N/A       N/A        Y       25561
Task Status of Volume gv01
------------------------------------------------------------------------------
There are no active volume tasks
[root@nfs01 /]#
[root@nfs01 /]# gluster volume info
Volume Name: gv01
Type: Replicate
Volume ID: e5ccc75e-5192-45ac-b410-a34ebd777666
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: nfs01:/bricks/0/gv01
Brick2: nfs02:/bricks/0/gv01
Options Reconfigured:
transport.address-family: inet
nfs.disable: on
performance.client-io-threads: off
nfs.trusted-sync: on
performance.cache-size: 1GB
performance.io-thread-count: 16
performance.write-behind-window-size: 8MB
performance.readdir-ahead: on
client.event-threads: 8
server.event-threads: 8
cluster.quorum-type: auto
cluster.server-quorum-type: server
[root@nfs01 /]#
==> n.log <==
[2018-04-09 05:08:13.704156] I [MSGID: 100030] [glusterfsd.c:2556:main]
0-/usr/sbin/glusterfs: Started running /usr/sbin/glusterfs version
3.13.2 (args: /usr/sbin/glusterfs --process-name fuse
--volfile-server=nfs01 --volfile-id=/gv01 /n)
[2018-04-09 05:08:13.711255] W [MSGID: 101002]
[options.c:995:xl_opt_validate] 0-glusterfs: option 'address-family' is
deprecated, preferred is 'transport.address-family', continuing with
correction
[2018-04-09 05:08:13.728297] W [socket.c:3216:socket_connect]
0-glusterfs: Error disabling sockopt IPV6_V6ONLY: "Protocol not
available"
[2018-04-09 05:08:13.729025] I [MSGID: 101190]
[event-epoll.c:613:event_dispatch_epoll_worker] 0-epoll: Started thread
with index 1
[2018-04-09 05:08:13.737757] I [MSGID: 101190]
[event-epoll.c:613:event_dispatch_epoll_worker] 0-epoll: Started thread
with index 2
[2018-04-09 05:08:13.738114] I [MSGID: 101190]
[event-epoll.c:613:event_dispatch_epoll_worker] 0-epoll: Started thread
with index 3
[2018-04-09 05:08:13.738203] I [MSGID: 101190]
[event-epoll.c:613:event_dispatch_epoll_worker] 0-epoll: Started thread
with index 4
[2018-04-09 05:08:13.738324] I [MSGID: 101190]
[event-epoll.c:613:event_dispatch_epoll_worker] 0-epoll: Started thread
with index 5
[2018-04-09 05:08:13.738330] I [MSGID: 101190]
[event-epoll.c:613:event_dispatch_epoll_worker] 0-epoll: Started thread
with index 6
[2018-04-09 05:08:13.738655] I [MSGID: 101190]
[event-epoll.c:613:event_dispatch_epoll_worker] 0-epoll: Started thread
with index 7
[2018-04-09 05:08:13.738742] I [MSGID: 101190]
[event-epoll.c:613:event_dispatch_epoll_worker] 0-epoll: Started thread
with index 8
[2018-04-09 05:08:13.739460] W [MSGID: 101174]
[graph.c:363:_log_if_unknown_option] 0-gv01-readdir-ahead: option
'parallel-readdir' is not recognized
[2018-04-09 05:08:13.739787] I [MSGID: 114020] [client.c:2360:notify]
0-gv01-client-0: parent translators are ready, attempting connect on
transport
[2018-04-09 05:08:13.747040] W [socket.c:3216:socket_connect]
0-gv01-client-0: Error disabling sockopt IPV6_V6ONLY: "Protocol not
available"
[2018-04-09 05:08:13.747372] I [MSGID: 114020] [client.c:2360:notify]
0-gv01-client-1: parent translators are ready, attempting connect on
transport
[2018-04-09 05:08:13.747883] E [MSGID: 114058]
[client-handshake.c:1571:client_query_portmap_cbk] 0-gv01-client-0:
failed to get the port number for remote subvolume. Please run 'gluster
volume status' on server to see if brick process is running.
[2018-04-09 05:08:13.748026] I [MSGID: 114018]
[client.c:2285:client_rpc_notify] 0-gv01-client-0: disconnected from
gv01-client-0. Client process will keep trying to connect to glusterd
until brick's port is available
[2018-04-09 05:08:13.748070] W [MSGID: 108001]
[afr-common.c:5391:afr_notify] 0-gv01-replicate-0: Client-quorum is
not met
[2018-04-09 05:08:13.754493] W [socket.c:3216:socket_connect]
0-gv01-client-1: Error disabling sockopt IPV6_V6ONLY: "Protocol not
available"
Final graph:
+------------------------------------------------------------------------------+
1: volume gv01-client-0
2: type protocol/client
3: option ping-timeout 42
4: option remote-host nfs01
5: option remote-subvolume /bricks/0/gv01
6: option transport-type socket
7: option transport.address-family inet
8: option username 916ccf06-dc1d-467f-bc3d-f00a7449618f
9: option password a44739e0-9587-411f-8e6a-9a6a4e46156c
10: option event-threads 8
11: option transport.tcp-user-timeout 0
12: option transport.socket.keepalive-time 20
13: option transport.socket.keepalive-interval 2
14: option transport.socket.keepalive-count 9
15: option send-gids true
16: end-volume
17:
18: volume gv01-client-1
19: type protocol/client
20: option ping-timeout 42
21: option remote-host nfs02
22: option remote-subvolume /bricks/0/gv01
23: option transport-type socket
24: option transport.address-family inet
25: option username 916ccf06-dc1d-467f-bc3d-f00a7449618f
26: option password a44739e0-9587-411f-8e6a-9a6a4e46156c
27: option event-threads 8
28: option transport.tcp-user-timeout 0
29: option transport.socket.keepalive-time 20
30: option transport.socket.keepalive-interval 2
31: option transport.socket.keepalive-count 9
32: option send-gids true
33: end-volume
34:
35: volume gv01-replicate-0
36: type cluster/replicate
37: option afr-pending-xattr gv01-client-0,gv01-client-1
38: option quorum-type auto
39: option use-compound-fops off
40: subvolumes gv01-client-0 gv01-client-1
41: end-volume
42:
43: volume gv01-dht
44: type cluster/distribute
45: option lock-migration off
46: subvolumes gv01-replicate-0
47: end-volume
48:
49: volume gv01-write-behind
50: type performance/write-behind
51: option cache-size 8MB
52: subvolumes gv01-dht
53: end-volume
54:
55: volume gv01-read-ahead
56: type performance/read-ahead
57: subvolumes gv01-write-behind
58: end-volume
59:
60: volume gv01-readdir-ahead
61: type performance/readdir-ahead
62: option parallel-readdir off
63: option rda-request-size 131072
64: option rda-cache-limit 10MB
65: subvolumes gv01-read-ahead
66: end-volume
67:
68: volume gv01-io-cache
69: type performance/io-cache
70: option cache-size 1GB
71: subvolumes gv01-readdir-ahead
72: end-volume
73:
74: volume gv01-quick-read
75: type performance/quick-read
76: option cache-size 1GB
77: subvolumes gv01-io-cache
78: end-volume
79:
80: volume gv01-open-behind
81: type performance/open-behind
82: subvolumes gv01-quick-read
83: end-volume
84:
85: volume gv01-md-cache
86: type performance/md-cache
87: subvolumes gv01-open-behind
88: end-volume
89:
90: volume gv01
91: type debug/io-stats
92: option log-level INFO
93: option latency-measurement off
94: option count-fop-hits off
95: subvolumes gv01-md-cache
96: end-volume
97:
98: volume meta-autoload
99: type meta
100: subvolumes gv01
101: end-volume
102:
+------------------------------------------------------------------------------+
[2018-04-09 05:08:13.922631] E [socket.c:2374:socket_connect_finish]
0-gv01-client-1: connection to 192.168.0.119:24007 failed (No route to
host); disconnecting socket
[2018-04-09 05:08:13.922690] E [MSGID: 108006]
[afr-common.c:5164:__afr_handle_child_down_event] 0-gv01-replicate-0:
All subvolumes are down. Going offline until atleast one of them comes
back up.
[2018-04-09 05:08:13.926201] I [fuse-bridge.c:4205:fuse_init]
0-glusterfs-fuse: FUSE inited with protocol versions: glusterfs 7.24
kernel 7.22
[2018-04-09 05:08:13.926245] I [fuse-bridge.c:4835:fuse_graph_sync]
0-fuse: switched to graph 0
[2018-04-09 05:08:13.926518] I [MSGID: 108006]
[afr-common.c:5444:afr_local_init] 0-gv01-replicate-0: no subvolumes up
[2018-04-09 05:08:13.926671] E [MSGID: 101046]
[dht-common.c:1501:dht_lookup_dir_cbk] 0-gv01-dht: dict is null
[2018-04-09 05:08:13.926762] E [fuse-bridge.c:4271:fuse_first_lookup]
0-fuse: first lookup on root failed (Transport endpoint is not
connected)
[2018-04-09 05:08:13.927207] I [MSGID: 108006]
[afr-common.c:5444:afr_local_init] 0-gv01-replicate-0: no subvolumes up
[2018-04-09 05:08:13.927262] E [MSGID: 101046]
[dht-common.c:1501:dht_lookup_dir_cbk] 0-gv01-dht: dict is null
[2018-04-09 05:08:13.927301] W
[fuse-resolve.c:132:fuse_resolve_gfid_cbk] 0-fuse:
00000000-0000-0000-0000-000000000001: failed to resolve (Transport
endpoint is not connected)
[2018-04-09 05:08:13.927339] E [fuse-bridge.c:900:fuse_getattr_resume]
0-glusterfs-fuse: 2: GETATTR 1 (00000000-0000-0000-0000-000000000001)
resolution failed
[2018-04-09 05:08:13.931497] I [MSGID: 108006]
[afr-common.c:5444:afr_local_init] 0-gv01-replicate-0: no subvolumes up
[2018-04-09 05:08:13.931558] E [MSGID: 101046]
[dht-common.c:1501:dht_lookup_dir_cbk] 0-gv01-dht: dict is null
[2018-04-09 05:08:13.931599] W
[fuse-resolve.c:132:fuse_resolve_gfid_cbk] 0-fuse:
00000000-0000-0000-0000-000000000001: failed to resolve (Transport
endpoint is not connected)
[2018-04-09 05:08:13.931623] E [fuse-bridge.c:900:fuse_getattr_resume]
0-glusterfs-fuse: 3: GETATTR 1 (00000000-0000-0000-0000-000000000001)
resolution failed
[2018-04-09 05:08:13.937258] I [fuse-bridge.c:5093:fuse_thread_proc]
0-fuse: initating unmount of /n
[2018-04-09 05:08:13.938043] W [glusterfsd.c:1393:cleanup_and_exit]
(-->/lib64/libpthread.so.0(+0x7e25) [0x7fb80b05ae25]
-->/usr/sbin/glusterfs(glusterfs_sigwaiter+0xe5) [0x560b52471675]
-->/usr/sbin/glusterfs(cleanup_and_exit+0x6b) [0x560b5247149b] ) 0-:
received signum (15), shutting down
[2018-04-09 05:08:13.938086] I [fuse-bridge.c:5855:fini] 0-fuse:
Unmounting '/n'.
[2018-04-09 05:08:13.938106] I [fuse-bridge.c:5860:fini] 0-fuse: Closing
fuse connection to '/n'.
==> glusterd.log <==
[2018-04-09 05:08:15.118078] W [socket.c:3216:socket_connect]
0-management: Error disabling sockopt IPV6_V6ONLY: "Protocol not
available"
==> glustershd.log <==
[2018-04-09 05:08:15.282192] W [socket.c:3216:socket_connect]
0-gv01-client-0: Error disabling sockopt IPV6_V6ONLY: "Protocol not
available"
[2018-04-09 05:08:15.289508] W [socket.c:3216:socket_connect]
0-gv01-client-1: Error disabling sockopt IPV6_V6ONLY: "Protocol not
available"
--
Cheers,
Tom K.
-------------------------------------------------------------------------------------
Living on earth is expensive, but it includes a free trip around the
sun.
--
Cheers,
Tom K.
-------------------------------------------------------------------------------------
Living on earth is expensive, but it includes a free trip around the sun.
_______________________________________________
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
http://lists.gluster.org/mailman/listinfo/gluster-users