Hello, I'm trying to find a solution to an error; maybe someone can help. I'm running CentOS 7 with GlusterFS 3.6.3. I have two nodes on the same network and a replicated volume. When both nodes are up, the volume is fine and I can mount it over NFS on each node. But if one node is down when I reboot the other, the volume can't be mounted. The client log shows the error "failed to get the port number for remote subvolume":

+------------------------------------------------------------------------------+
  1: volume data-sync-client-0
  2:     type protocol/client
  3:     option ping-timeout 42
  4:     option remote-host host1
  5:     option remote-subvolume /gluster
  6:     option transport-type socket
  7:     option username fbd26745-afb8-4729-801e-e1a2db8ff38f
  8:     option password d077f325-1d03-494d-bfe5-d662ce2d22fe
  9:     option send-gids true
 10: end-volume
 11:
 12: volume data-sync-client-1
 13:     type protocol/client
 14:     option ping-timeout 42
 15:     option remote-host host2
 16:     option remote-subvolume /gluster
 17:     option transport-type socket
 18:     option username fbd26745-afb8-4729-801e-e1a2db8ff38f
 19:     option password d077f325-1d03-494d-bfe5-d662ce2d22fe
 20:     option send-gids true
 21: end-volume
 22:
 23: volume data-sync-replicate-0
 24:     type cluster/replicate
 25:     subvolumes data-sync-client-0 data-sync-client-1
 26: end-volume
 27:
 28: volume data-sync-dht
 29:     type cluster/distribute
 30:     subvolumes data-sync-replicate-0
 31: end-volume
 32:
 33: volume data-sync-write-behind
 34:     type performance/write-behind
 35:     subvolumes data-sync-dht
 36: end-volume
 37:
 38: volume data-sync-read-ahead
 39:     type performance/read-ahead
 40:     subvolumes data-sync-write-behind
 41: end-volume
 42:
 43: volume data-sync-io-cache
 44:     type performance/io-cache
 45:     subvolumes data-sync-read-ahead
 46: end-volume
 47:
 48: volume data-sync-quick-read
 49:     type performance/quick-read
 50:     subvolumes data-sync-io-cache
 51: end-volume
 52:
 53: volume data-sync-open-behind
 54:     type performance/open-behind
 55:     subvolumes data-sync-quick-read
 56: end-volume
 57:
 58: volume data-sync-md-cache
 59:     type performance/md-cache
 60:     subvolumes data-sync-open-behind
 61: end-volume
 62:
 63: volume data-sync
 64:     type debug/io-stats
 65:     option latency-measurement off
 66:     option count-fop-hits off
 67:     subvolumes data-sync-md-cache
 68: end-volume
 69:
 70: volume meta-autoload
 71:     type meta
 72:     subvolumes data-sync
 73: end-volume
 74:
+------------------------------------------------------------------------------+
[2015-07-08 06:06:08.088983] E [client-handshake.c:1496:client_query_portmap_cbk] 0-data-sync-client-1: failed to get the port number for remote subvolume. Please run 'gluster volume status' on server to see if brick process is running.
[2015-07-08 06:06:08.089034] I [client.c:2215:client_rpc_notify] 0-data-sync-client-1: disconnected from data-sync-client-1. Client process will keep trying to connect to glusterd until brick's port is available
[2015-07-08 06:06:10.769962] E [socket.c:2276:socket_connect_finish] 0-data-sync-client-0: connection to 192.168.1.12:24007 failed (No route to host)
[2015-07-08 06:06:10.769991] E [MSGID: 108006] [afr-common.c:3708:afr_notify] 0-data-sync-replicate-0: All subvolumes are down. Going offline until atleast one of them comes back up.
[2015-07-08 06:06:10.772310] I [fuse-bridge.c:5080:fuse_graph_setup] 0-fuse: switched to graph 0
[2015-07-08 06:06:10.772430] I [fuse-bridge.c:4009:fuse_init] 0-glusterfs-fuse: FUSE inited with protocol versions: glusterfs 7.22 kernel 7.22
[2015-07-08 06:06:10.772503] I [afr-common.c:3839:afr_local_init] 0-data-sync-replicate-0: no subvolumes up
[2015-07-08 06:06:10.772631] I [afr-common.c:3839:afr_local_init] 0-data-sync-replicate-0: no subvolumes up
[2015-07-08 06:06:10.772653] W [fuse-bridge.c:779:fuse_attr_cbk] 0-glusterfs-fuse: 2: LOOKUP() / => -1 (Transport endpoint is not connected)
[2015-07-08 06:06:10.776974] I [afr-common.c:3839:afr_local_init] 0-data-sync-replicate-0: no subvolumes up
[2015-07-08 06:06:10.777810] I [fuse-bridge.c:4921:fuse_thread_proc] 0-fuse: unmounting /data-sync
[2015-07-08 06:06:10.778007] W [glusterfsd.c:1194:cleanup_and_exit] (--> 0-: received signum (15), shutting down
[2015-07-08 06:06:10.778022] I [fuse-bridge.c:5599:fini] 0-fuse: Unmounting '/data-sync'.

The volume is started but not online:

# gluster volume status
Status of volume: data-sync
Gluster process                                 Port    Online  Pid
------------------------------------------------------------------------------
Brick host2:/gluster                            N/A     N       N/A
NFS Server on localhost                         N/A     N       N/A
Self-heal Daemon on localhost                   N/A     N       N/A

Task Status of Volume data-sync
------------------------------------------------------------------------------
There are no active volume tasks

To recover, I have to stop the volume, start it again, and remount it (exact commands in the P.S. below). I can't find a way to make it come up correctly on its own at each boot. I saw in a bug report that this is a deliberate protection: the volume stays offline while the other node is unreachable, to avoid serving stale data. Any idea how to force it online at boot?

Thanks,
Nicolas
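
P.S. For reference, the manual workaround I run after each boot looks like the following. This is a sketch: the volume name data-sync and mount point /data-sync are taken from the logs above, and the mount command assumes a matching fstab entry; adjust for your own setup.

# gluster volume stop data-sync
# gluster volume start data-sync
# mount /data-sync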
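
P.P.S. The only boot-time idea I have so far is to force-start the volume from rc.local once glusterd is up, but that feels like a hack rather than a real fix. A rough, untested sketch (it assumes /etc/rc.d/rc.local is executable on CentOS 7 and that glusterd has already started by the time it runs):

#!/bin/bash
# /etc/rc.d/rc.local (untested sketch)
# 'start ... force' brings the local brick up even while the peer is
# unreachable, i.e. it bypasses the stale-data protection described above,
# so consistency becomes my own responsibility.
gluster volume start data-sync force
mount /data-sync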