> Note that "stripe" is not tested much and practically unmaintained.

Ah, this was what I suspected. Understood. I'll be happy with "shard". That said, "stripe" works fine with transport=tcp.

The failure reproduces with just two RDMA servers (with InfiniBand), one of which also acts as a client. I looked into the logs. I paste the lengthy excerpts below, hoping that mail systems do not fold the lines automatically...

Takao

---

As soon as I started the interactive "gluster" command, the following appeared in cli.log. The last line repeats every 3 seconds.

[2017-08-16 10:49:00.028789] I [cli.c:759:main] 0-cli: Started running gluster with version 3.10.3
[2017-08-16 10:49:00.032509] I [cli-cmd-volume.c:2320:cli_check_gsync_present] 0-: geo-replication not installed
[2017-08-16 10:49:00.033038] I [MSGID: 101190] [event-epoll.c:629:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1
[2017-08-16 10:49:00.033092] I [socket.c:2415:socket_event_handler] 0-transport: EPOLLERR - disconnecting now
[2017-08-16 10:49:03.032434] I [socket.c:2415:socket_event_handler] 0-transport: EPOLLERR - disconnecting now

When I do:

gluster> volume create gv0 stripe 2 transport rdma gluster-s1-fdr:/data/brick1/gv0 gluster-s2-fdr:/data/brick1/gv0
volume create: gv0: success: please start the volume to access data
gluster> volume start gv0
volume start: gv0: success

the following appears in glusterd.log. Note the "E" flag on the last line.

[2017-08-16 10:38:48.451329] I [MSGID: 106062] [glusterd-volume-ops.c:2617:glusterd_op_start_volume] 0-management: Global dict not present.
[2017-08-16 10:38:48.751913] I [MSGID: 106143] [glusterd-pmap.c:277:pmap_registry_bind] 0-pmap: adding brick /data/brick1/gv0.rdma on port 49152
[2017-08-16 10:38:48.752222] I [rpc-clnt.c:1059:rpc_clnt_connection_init] 0-management: setting frame-timeout to 600
[2017-08-16 10:38:48.915868] I [rpc-clnt.c:1059:rpc_clnt_connection_init] 0-snapd: setting frame-timeout to 600
[2017-08-16 10:38:48.915977] I [MSGID: 106132] [glusterd-proc-mgmt.c:83:glusterd_proc_stop] 0-management: nfs already stopped
[2017-08-16 10:38:48.916008] I [MSGID: 106568] [glusterd-svc-mgmt.c:228:glusterd_svc_stop] 0-management: nfs service is stopped
[2017-08-16 10:38:48.916189] I [MSGID: 106132] [glusterd-proc-mgmt.c:83:glusterd_proc_stop] 0-management: bitd already stopped
[2017-08-16 10:38:48.916210] I [MSGID: 106568] [glusterd-svc-mgmt.c:228:glusterd_svc_stop] 0-management: bitd service is stopped
[2017-08-16 10:38:48.916232] I [MSGID: 106132] [glusterd-proc-mgmt.c:83:glusterd_proc_stop] 0-management: scrub already stopped
[2017-08-16 10:38:48.916245] I [MSGID: 106568] [glusterd-svc-mgmt.c:228:glusterd_svc_stop] 0-management: scrub service is stopped
[2017-08-16 10:38:49.392687] I [run.c:191:runner_log] (-->/usr/lib64/glusterfs/3.10.3/xlator/mgmt/glusterd.so(+0xdbd7a) [0x7fbb107e5d7a] -->/usr/lib64/glusterfs/3.10.3/xlator/mgmt/glusterd.so(+0xdb83d) [0x7fbb107e583d] -->/lib64/libglusterfs.so.0(runner_log+0x115) [0x7fbb1bc5c385] ) 0-management: Ran script: /var/lib/glusterd/hooks/1/start/post/S29CTDBsetup.sh --volname=gv0 --first=yes --version=1 --volume-op=start --gd-workdir=/var/lib/glusterd
[2017-08-16 10:38:49.402177] E [run.c:191:runner_log] (-->/usr/lib64/glusterfs/3.10.3/xlator/mgmt/glusterd.so(+0xdbd7a) [0x7fbb107e5d7a] -->/usr/lib64/glusterfs/3.10.3/xlator/mgmt/glusterd.so(+0xdb79b) [0x7fbb107e579b] -->/lib64/libglusterfs.so.0(runner_log+0x115) [0x7fbb1bc5c385] ) 0-management: Failed to execute script: /var/lib/glusterd/hooks/1/start/post/S30samba-start.sh --volname=gv0 --first=yes --version=1 --volume-op=start --gd-workdir=/var/lib/glusterd

This error looks related to Samba, which I do not use. The same "E" error happens even when I use transport=tcp.

There is no error in the brick logs. Below is what was written to data-brick1-gv0.log:

[2017-08-16 10:59:24.127902] I [MSGID: 100030] [glusterfsd.c:2475:main] 0-/usr/sbin/glusterfsd: Started running /usr/sbin/glusterfsd version 3.10.3 (args: /usr/sbin/glusterfsd -s gluster-s1-fdr --volfile-id gv0.gluster-s1-fdr.data-brick1-gv0 -p /var/lib/glusterd/vols/gv0/run/gluster-s1-fdr-data-brick1-gv0.pid -S /var/run/gluster/6b6de65a92fa07146541a9474ffa2fd2.socket --brick-name /data/brick1/gv0 -l /var/log/glusterfs/bricks/data-brick1-gv0.log --xlator-option *-posix.glusterd-uuid=5c750a8f-c45b-4a7e-af84-16c1999874b7 --brick-port 49152 --xlator-option gv0-server.listen-port=49152 --volfile-server-transport=rdma)
[2017-08-16 10:59:24.134054] I [MSGID: 101190] [event-epoll.c:629:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1
[2017-08-16 10:59:24.137118] I [rpcsvc.c:2237:rpcsvc_set_outstanding_rpc_limit] 0-rpc-service: Configured rpc.outstanding-rpc-limit with value 64
[2017-08-16 10:59:24.138384] W [MSGID: 101002] [options.c:954:xl_opt_validate] 0-gv0-server: option 'listen-port' is deprecated, preferred is 'transport.rdma.listen-port', continuing with correction
[2017-08-16 10:59:24.142207] I [MSGID: 121050] [ctr-helper.c:259:extract_ctr_options] 0-gfdbdatastore: CTR Xlator is disabled.
[2017-08-16 10:59:24.237783] I [trash.c:2493:init] 0-gv0-trash: no option specified for 'eliminate', using NULL
[2017-08-16 10:59:24.239129] W [MSGID: 101174] [graph.c:361:_log_if_unknown_option] 0-gv0-server: option 'rpc-auth.auth-glusterfs' is not recognized
[2017-08-16 10:59:24.239189] W [MSGID: 101174] [graph.c:361:_log_if_unknown_option] 0-gv0-server: option 'rpc-auth.auth-unix' is not recognized
[2017-08-16 10:59:24.239203] W [MSGID: 101174] [graph.c:361:_log_if_unknown_option] 0-gv0-server: option 'rpc-auth.auth-null' is not recognized
[2017-08-16 10:59:24.239226] W [MSGID: 101174] [graph.c:361:_log_if_unknown_option] 0-gv0-server: option 'auth-path' is not recognized
[2017-08-16 10:59:24.239235] W [MSGID: 101174] [graph.c:361:_log_if_unknown_option] 0-/data/brick1/gv0: option 'auth.addr./data/brick1/gv0.allow' is not recognized
[2017-08-16 10:59:24.239251] W [MSGID: 101174] [graph.c:361:_log_if_unknown_option] 0-/data/brick1/gv0: option 'auth-path' is not recognized
[2017-08-16 10:59:24.239257] W [MSGID: 101174] [graph.c:361:_log_if_unknown_option] 0-/data/brick1/gv0: option 'auth.login.2d6e8c76-47ed-4ac4-87ff-f96693f048b5.password' is not recognized
[2017-08-16 10:59:24.239263] W [MSGID: 101174] [graph.c:361:_log_if_unknown_option] 0-/data/brick1/gv0: option 'auth.login./data/brick1/gv0.allow' is not recognized
[2017-08-16 10:59:24.239276] W [MSGID: 101174] [graph.c:361:_log_if_unknown_option] 0-gv0-quota: option 'timeout' is not recognized
[2017-08-16 10:59:24.239311] W [MSGID: 101174] [graph.c:361:_log_if_unknown_option] 0-gv0-trash: option 'brick-path' is not recognized
Final graph:
+------------------------------------------------------------------------------+
  1: volume gv0-posix
  2:     type storage/posix
  3:     option glusterd-uuid 5c750a8f-c45b-4a7e-af84-16c1999874b7
  4:     option directory /data/brick1/gv0
  5:     option volume-id 6491a59c-866f-4a1d-b21b-f894ea0e50cd
  6: end-volume
  7:
  8: volume gv0-trash
  9:     type features/trash
 10:     option trash-dir .trashcan
 11:     option brick-path /data/brick1/gv0
 12:     option trash-internal-op off
 13:     subvolumes gv0-posix
 14: end-volume
 15:
 16: volume gv0-changetimerecorder
 17:     type features/changetimerecorder
 18:     option db-type sqlite3
 19:     option hot-brick off
 20:     option db-name gv0.db
 21:     option db-path /data/brick1/gv0/.glusterfs/
 22:     option record-exit off
 23:     option ctr_link_consistency off
 24:     option ctr_lookupheal_link_timeout 300
 25:     option ctr_lookupheal_inode_timeout 300
 26:     option record-entry on
 27:     option ctr-enabled off
 28:     option record-counters off
 29:     option ctr-record-metadata-heat off
 30:     option sql-db-cachesize 12500
 31:     option sql-db-wal-autocheckpoint 25000
 32:     subvolumes gv0-trash
 33: end-volume
 34:
 35: volume gv0-changelog
 36:     type features/changelog
 37:     option changelog-brick /data/brick1/gv0
 38:     option changelog-dir /data/brick1/gv0/.glusterfs/changelogs
 39:     option changelog-barrier-timeout 120
 40:     subvolumes gv0-changetimerecorder
 41: end-volume
 42:
 43: volume gv0-bitrot-stub
 44:     type features/bitrot-stub
 45:     option export /data/brick1/gv0
 46:     subvolumes gv0-changelog
 47: end-volume
 48:
 49: volume gv0-access-control
 50:     type features/access-control
 51:     subvolumes gv0-bitrot-stub
 52: end-volume
 53:
 54: volume gv0-locks
 55:     type features/locks
 56:     subvolumes gv0-access-control
 57: end-volume
 58:
 59: volume gv0-worm
 60:     type features/worm
 61:     option worm off
 62:     option worm-file-level off
 63:     subvolumes gv0-locks
 64: end-volume
 65:
 66: volume gv0-read-only
 67:     type features/read-only
 68:     option read-only off
 69:     subvolumes gv0-worm
 70: end-volume
 71:
 72: volume gv0-leases
 73:     type features/leases
 74:     option leases off
 75:     subvolumes gv0-read-only
 76: end-volume
 77:
 78: volume gv0-upcall
 79:     type features/upcall
 80:     option cache-invalidation off
 81:     subvolumes gv0-leases
 82: end-volume
 83:
 84: volume gv0-io-threads
 85:     type performance/io-threads
 86:     subvolumes gv0-upcall
 87: end-volume
 88:
 89: volume gv0-marker
 90:     type features/marker
 91:     option volume-uuid 6491a59c-866f-4a1d-b21b-f894ea0e50cd
 92:     option timestamp-file /var/lib/glusterd/vols/gv0/marker.tstamp
 93:     option quota-version 0
 94:     option xtime off
 95:     option gsync-force-xtime off
 96:     option quota off
 97:     option inode-quota off
 98:     subvolumes gv0-io-threads
 99: end-volume
100:
101: volume gv0-barrier
102:     type features/barrier
103:     option barrier disable
104:     option barrier-timeout 120
105:     subvolumes gv0-marker
106: end-volume
107:
108: volume gv0-index
109:     type features/index
110:     option index-base /data/brick1/gv0/.glusterfs/indices
111:     subvolumes gv0-barrier
112: end-volume
113:
114: volume gv0-quota
115:     type features/quota
116:     option volume-uuid gv0
117:     option server-quota off
118:     option timeout 0
119:     option deem-statfs off
120:     subvolumes gv0-index
121: end-volume
122:
123: volume gv0-io-stats
124:     type debug/io-stats
125:     option unique-id /data/brick1/gv0
126:     option log-level INFO
127:     option latency-measurement off
128:     option count-fop-hits off
129:     subvolumes gv0-quota
130: end-volume
131:
132: volume /data/brick1/gv0
133:     type performance/decompounder
134:     option auth.addr./data/brick1/gv0.allow *
135:     option auth-path /data/brick1/gv0
136:     option auth.login.2d6e8c76-47ed-4ac4-87ff-f96693f048b5.password e5fe5e7e-6722-4845-8149-edaf14065ac0
137:     option auth.login./data/brick1/gv0.allow 2d6e8c76-47ed-4ac4-87ff-f96693f048b5
138:     subvolumes gv0-io-stats
139: end-volume
140:
141: volume gv0-server
142:     type protocol/server
143:     option transport.rdma.listen-port 49152
144:     option rpc-auth.auth-glusterfs on
145:     option rpc-auth.auth-unix on
146:     option rpc-auth.auth-null on
147:     option rpc-auth-allow-insecure on
148:     option transport-type rdma
149:     option auth.login./data/brick1/gv0.allow 2d6e8c76-47ed-4ac4-87ff-f96693f048b5
150:     option auth.login.2d6e8c76-47ed-4ac4-87ff-f96693f048b5.password e5fe5e7e-6722-4845-8149-edaf14065ac0
151:     option auth-path /data/brick1/gv0
152:     option auth.addr./data/brick1/gv0.allow *
153:     subvolumes /data/brick1/gv0
154: end-volume
155:
+------------------------------------------------------------------------------+

In any case, gluster reports that the volume started successfully:

gluster> volume info gv0

Volume Name: gv0
Type: Stripe
Volume ID: 6491a59c-866f-4a1d-b21b-f894ea0e50cd
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x 2 = 2
Transport-type: rdma
Bricks:
Brick1: gluster-s1-fdr:/data/brick1/gv0
Brick2: gluster-s2-fdr:/data/brick1/gv0
Options Reconfigured:
nfs.disable: on

gluster> volume status gv0
Status of volume: gv0
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick gluster-s1-fdr:/data/brick1/gv0       0         49152      Y       2553
Brick gluster-s2-fdr:/data/brick1/gv0       0         49152      Y       2580

Task Status of Volume gv0
------------------------------------------------------------------------------
There are no active volume tasks

I then proceeded to mount:

[root@gluster-s1 ~]# mount -t glusterfs glusterfs-s1-fdr:/gv0 /mnt
Mount failed. Please check the log file for more details.
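As a quick sanity check before mounting, it can help to confirm that the volfile-server name actually resolves: the mount command above uses "glusterfs-s1-fdr", while the volume was created with bricks on "gluster-s1-fdr". A minimal sketch (check_resolves is a helper I made up for illustration; the cluster hostnames will only resolve on hosts that know them):

```shell
#!/bin/sh
# Sketch: report whether a hostname resolves, using getent (NSS-aware,
# so it honors /etc/hosts as well as DNS). The helper name is hypothetical.
check_resolves() {
    if getent hosts "$1" > /dev/null; then
        echo "$1: resolves"
    else
        echo "$1: does NOT resolve"
    fi
}

check_resolves localhost          # baseline: should resolve on any host
check_resolves gluster-s1-fdr     # peer name used at volume-create time
check_resolves glusterfs-s1-fdr   # name passed to mount (note the extra "fs")
```

If the second and third checks disagree, the mount is simply pointed at a name the resolver has never heard of, independent of any RDMA issue.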
The following was written to mnt.log:

[2017-08-16 11:09:08.794585] I [MSGID: 100030] [glusterfsd.c:2475:main] 0-/usr/sbin/glusterfs: Started running /usr/sbin/glusterfs version 3.10.3 (args: /usr/sbin/glusterfs --volfile-server=glusterfs-s1-fdr --volfile-id=/gv0 /mnt)
[2017-08-16 11:09:08.949784] E [MSGID: 101075] [common-utils.c:307:gf_resolve_ip6] 0-resolver: getaddrinfo failed (unknown name or service)
[2017-08-16 11:09:08.949815] E [name.c:262:af_inet_client_get_remote_sockaddr] 0-glusterfs: DNS resolution failed on host glusterfs-s1-fdr
[2017-08-16 11:09:08.949956] I [glusterfsd-mgmt.c:2134:mgmt_rpc_notify] 0-glusterfsd-mgmt: disconnected from remote-host: glusterfs-s1-fdr
[2017-08-16 11:09:08.950097] I [glusterfsd-mgmt.c:2155:mgmt_rpc_notify] 0-glusterfsd-mgmt: Exhausted all volfile servers
[2017-08-16 11:09:08.950105] I [MSGID: 101190] [event-epoll.c:629:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1
[2017-08-16 11:09:08.950277] W [glusterfsd.c:1332:cleanup_and_exit] (-->/lib64/libgfrpc.so.0(rpc_clnt_notify+0xab) [0x7fdfa46bba2b] -->/usr/sbin/glusterfs(+0x10afd) [0x7fdfa4df2afd] -->/usr/sbin/glusterfs(cleanup_and_exit+0x6b) [0x7fdfa4debe4b] ) 0-: received signum (1), shutting down
[2017-08-16 11:09:08.950326] I [fuse-bridge.c:5802:fini] 0-fuse: Unmounting '/mnt'.
[2017-08-16 11:09:08.950582] W [glusterfsd.c:1332:cleanup_and_exit] (-->/lib64/libpthread.so.0(+0x7dc5) [0x7fdfa3752dc5] -->/usr/sbin/glusterfs(glusterfs_sigwaiter+0xe5) [0x7fdfa4dec025] -->/usr/sbin/glusterfs(cleanup_and_exit+0x6b) [0x7fdfa4debe4b] ) 0-: received signum (15), shutting down

_______________________________________________
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
http://lists.gluster.org/mailman/listinfo/gluster-users