Thanks
My plan for the servers is to reinstall them all with RHEL 7 and the newest Gluster version. So for now my focus is just to free glustertst03
and glustertst04 for reinstallation.
So glusterfs-rdma is not installed, and I guess rdma is not working... :-)
Anyhow, do I need to install rdma to get rebalance working, or is there a way to force it to use tcp? I thought the fallback was tcp.
Or maybe the problem is that only rdma is mentioned in glu_linux_dr2_oracle-rebalance.vol (see below).
***** I found this in the rebalance log on all nodes *********
[2017-05-03 12:22:56.188978] I [dht-shared.c:337:dht_init_regex] 0-glu_linux_dr2_oracle-dht: using regex rsync-hash-regex = ^\.(.+)\.[^.]+$
[2017-05-03 12:22:56.195108] E [rpc-transport.c:266:rpc_transport_load] 0-rpc-transport: /usr/lib64/glusterfs/3.6.9/rpc-transport/rdma.so: cannot open shared object file: No such file or directory
[2017-05-03 12:22:56.195135] W [rpc-transport.c:270:rpc_transport_load] 0-rpc-transport: volume 'glu_linux_dr2_oracle-client-7': transport-type 'rdma' is not valid or not found on this machine
[2017-05-03 12:22:56.195143] W [rpc-clnt.c:993:rpc_clnt_connection_init] 0-glu_linux_dr2_oracle-client-7: loading of new rpc-transport failed
[2017-05-03 12:22:56.195152] I [mem-pool.c:545:mem_pool_destroy] 0-glu_linux_dr2_oracle-client-7: size=588 max=0 total=0
[2017-05-03 12:22:56.195184] I [mem-pool.c:545:mem_pool_destroy] 0-glu_linux_dr2_oracle-client-7: size=124 max=0 total=0
[2017-05-03 12:22:56.195193] E [client.c:2434:client_init_rpc] 0-glu_linux_dr2_oracle-client-7: failed to initialize RPC
[2017-05-03 12:22:56.195201] E [xlator.c:430:xlator_init] 0-glu_linux_dr2_oracle-client-7: Initialization of volume 'glu_linux_dr2_oracle-client-7' failed, review your volfile again
[2017-05-03 12:22:56.195208] E [graph.c:322:glusterfs_graph_init] 0-glu_linux_dr2_oracle-client-7: initializing translator failed
[2017-05-03 12:22:56.195214] E [graph.c:525:glusterfs_graph_activate] 0-graph: init failed
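To pull only the error and warning entries out of a log like the one above, a small helper along these lines may be useful. It is a minimal sketch: the function name `log_problems` is hypothetical, and it assumes the log format shown here, where the severity letter is the third whitespace-separated field after the bracketed timestamp.

```shell
# Print only error (E) and warning (W) entries from a Gluster log file.
# Assumes the "[date time] LEVEL [source] ..." layout seen in the log above.
# Usage: log_problems /path/to/logfile
log_problems() {
    # $1 = "[date", $2 = "time]", $3 = severity letter (I, W, E, ...)
    awk '$3 == "E" || $3 == "W"' "$1"
}
```

It would be called with the path to the rebalance log on each node (the exact path depends on the installation). Continuation lines that were wrapped by a mail client will not match, since they do not start with a timestamp.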
******* Grep transport in rebalance.vol *******
# grep -A2 transport-type /var/lib/glusterd/vols/glu_linux_dr2_oracle/glu_linux_dr2_oracle-rebalance.vol
option transport-type rdma
option remote-subvolume /bricks/brick2/glu_linux_dr2_oracle
option remote-host glustoretst01.net.dr.dk
--
option transport-type rdma
option remote-subvolume /bricks/brick2/glu_linux_dr2_oracle
option remote-host glustoretst02.net.dr.dk
--
option transport-type rdma
option remote-subvolume /bricks/brick1/glu_linux_dr2_oracle
option remote-host glustoretst01.net.dr.dk
--
option transport-type rdma
option remote-subvolume /bricks/brick1/glu_linux_dr2_oracle
option remote-host glustoretst02.net.dr.dk
--
option transport-type rdma
option remote-subvolume /bricks/brick1/glu_linux_dr2_oracle
option remote-host glustoretst03.net.dr.dk
--
option transport-type rdma
option remote-subvolume /bricks/brick1/glu_linux_dr2_oracle
option remote-host glustoretst04.net.dr.dk
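
The grep output above can be condensed into a per-transport count, which makes it easier to see at a glance whether a volfile mentions anything other than rdma. This is only a sketch; the function name `count_transports` is hypothetical, and it assumes volfile lines of the form `option transport-type <value>` as shown above.

```shell
# Count how often each transport-type value appears in a volfile.
# Usage: count_transports /path/to/volume.vol
count_transports() {
    # Pick the value from every "option transport-type <value>" line,
    # then tally the distinct values and print them as "value: count".
    awk '$1 == "option" && $2 == "transport-type" { print $3 }' "$1" \
        | sort | uniq -c | awk '{ print $2 ": " $1 }'
}
```

Run against the rebalance volfile grepped above, it should print a single line for rdma if no client translator is configured for tcp.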
Regards
Jesper
From: Nithya Balachandran [mailto:nbalacha@xxxxxxxxxx]
Sent: 5 May 2017 13:16
To: Jesper Led Lauridsen TS Infra server <JLY@xxxxx>
Cc: gluster-users@xxxxxxxxxxx
Subject: Re: [Gluster-users] Remove-brick failed
On 4 May 2017 at 11:46, Jesper Led Lauridsen TS Infra server <JLY@xxxxx> wrote:
Hi
I'm trying to remove 2 bricks from a Distributed-Replicate volume without losing data, but the rebalance fails.
Any help is appreciated...
What I do:
# gluster volume remove-brick glu_linux_dr2_oracle replica 2 glustoretst03.net.dr.dk:/bricks/brick1/glu_linux_dr2_oracle glustoretst04.net.dr.dk:/bricks/brick1/glu_linux_dr2_oracle start
volume remove-brick start: success
ID: c2549eb4-e37a-4f0d-9273-3f7c580e9e80
# gluster volume remove-brick glu_linux_dr2_oracle replica 2 glustoretst03.net.dr.dk:/bricks/brick1/glu_linux_dr2_oracle glustoretst04.net.dr.dk:/bricks/brick1/glu_linux_dr2_oracle status
Node                     Rebalanced-files  size    scanned  failures  skipped  status  run time in secs
-----------------------  ----------------  ------  -------  --------  -------  ------  ----------------
glustoretst04.net.dr.dk  0                 0Bytes  0        0         0        failed  0.00
glustoretst03.net.dr.dk  0                 0Bytes  0        0         0        failed  0.00
******** log output *******
# cat etc-glusterfs-glusterd.vol.log
[2017-05-03 12:18:59.423867] I [glusterd-handler.c:1296:__glusterd_handle_cli_get_volume] 0-glusterd: Received get vol req
[2017-05-03 12:20:21.024213] I [glusterd-handler.c:3836:__glusterd_handle_status_volume] 0-management: Received status volume req for volume glu_int_dr2_dalet
[2017-05-03 12:21:10.813956] I [glusterd-handler.c:1296:__glusterd_handle_cli_get_volume] 0-glusterd: Received get vol req
[2017-05-03 12:22:45.298742] I [glusterd-brick-ops.c:676:__glusterd_handle_remove_brick] 0-management: Received rem brick req
[2017-05-03 12:22:45.298807] I [glusterd-brick-ops.c:722:__glusterd_handle_remove_brick] 0-management: request to change replica-count to 2
[2017-05-03 12:22:45.311705] I [glusterd-utils.c:11549:glusterd_generate_and_set_task_id] 0-management: Generated task-id c2549eb4-e37a-4f0d-9273-3f7c580e9e80 for key remove-brick-id
[2017-05-03 12:22:45.312296] I [glusterd-op-sm.c:5105:glusterd_bricks_select_remove_brick] 0-management: force flag is not set
[2017-05-03 12:22:46.414038] I [glusterd-volgen.c:1177:get_vol_nfs_transport_type] 0-glusterd: The default transport type for tcp,rdma volume is tcp if option is not defined by the user
[2017-05-03 12:22:46.419778] I [glusterd-volgen.c:1177:get_vol_nfs_transport_type] 0-glusterd: The default transport type for tcp,rdma volume is tcp if option is not defined by the user
[2017-05-03 12:22:46.425132] I [glusterd-volgen.c:1177:get_vol_nfs_transport_type] 0-glusterd: The default transport type for tcp,rdma volume is tcp if option is not defined by the user
[2017-05-03 12:22:46.429469] I [glusterd-volgen.c:1177:get_vol_nfs_transport_type] 0-glusterd: The default transport type for tcp,rdma volume is tcp if option is not defined by the user
[2017-05-03 12:22:46.433623] I [glusterd-volgen.c:1177:get_vol_nfs_transport_type] 0-glusterd: The default transport type for tcp,rdma volume is tcp if option is not defined by the user
[2017-05-03 12:22:46.439089] I [glusterd-volgen.c:1177:get_vol_nfs_transport_type] 0-glusterd: The default transport type for tcp,rdma volume is tcp if option is not defined by the user
[2017-05-03 12:22:46.444048] I [glusterd-volgen.c:1177:get_vol_nfs_transport_type] 0-glusterd: The default transport type for tcp,rdma volume is tcp if option is not defined by the user
[2017-05-03 12:22:46.448623] I [glusterd-volgen.c:1177:get_vol_nfs_transport_type] 0-glusterd: The default transport type for tcp,rdma volume is tcp if option is not defined by the user
[2017-05-03 12:22:46.457386] I [glusterd-volgen.c:1177:get_vol_nfs_transport_type] 0-glusterd: The default transport type for tcp,rdma volume is tcp if option is not defined by the user
[2017-05-03 12:22:46.538115] I [glusterd-volgen.c:1177:get_vol_nfs_transport_type] 0-glusterd: The default transport type for tcp,rdma volume is tcp if option is not defined by the user
[2017-05-03 12:22:46.542870] I [glusterd-volgen.c:1177:get_vol_nfs_transport_type] 0-glusterd: The default transport type for tcp,rdma volume is tcp if option is not defined by the user
[2017-05-03 12:22:46.547325] I [glusterd-volgen.c:1177:get_vol_nfs_transport_type] 0-glusterd: The default transport type for tcp,rdma volume is tcp if option is not defined by the user
[2017-05-03 12:22:46.551742] I [glusterd-volgen.c:1177:get_vol_nfs_transport_type] 0-glusterd: The default transport type for tcp,rdma volume is tcp if option is not defined by the user
[2017-05-03 12:22:46.555951] I [glusterd-volgen.c:1177:get_vol_nfs_transport_type] 0-glusterd: The default transport type for tcp,rdma volume is tcp if option is not defined by the user
[2017-05-03 12:22:46.560725] I [glusterd-volgen.c:1177:get_vol_nfs_transport_type] 0-glusterd: The default transport type for tcp,rdma volume is tcp if option is not defined by the user
[2017-05-03 12:22:46.565692] I [glusterd-volgen.c:1177:get_vol_nfs_transport_type] 0-glusterd: The default transport type for tcp,rdma volume is tcp if option is not defined by the user
[2017-05-03 12:22:46.570027] I [glusterd-volgen.c:1177:get_vol_nfs_transport_type] 0-glusterd: The default transport type for tcp,rdma volume is tcp if option is not defined by the user
[2017-05-03 12:22:46.578645] I [glusterd-volgen.c:1177:get_vol_nfs_transport_type] 0-glusterd: The default transport type for tcp,rdma volume is tcp if option is not defined by the user
[2017-05-03 12:22:47.663980] I [glusterd-utils.c:6316:glusterd_nfs_pmap_deregister] 0-: De-registered MOUNTV3 successfully
[2017-05-03 12:22:47.664372] I [glusterd-utils.c:6321:glusterd_nfs_pmap_deregister] 0-: De-registered MOUNTV1 successfully
[2017-05-03 12:22:47.664786] I [glusterd-utils.c:6326:glusterd_nfs_pmap_deregister] 0-: De-registered NFSV3 successfully
[2017-05-03 12:22:47.665175] I [glusterd-utils.c:6331:glusterd_nfs_pmap_deregister] 0-: De-registered NLM v4 successfully
[2017-05-03 12:22:47.665559] I [glusterd-utils.c:6336:glusterd_nfs_pmap_deregister] 0-: De-registered NLM v1 successfully
[2017-05-03 12:22:47.665943] I [glusterd-utils.c:6341:glusterd_nfs_pmap_deregister] 0-: De-registered ACL v3 successfully
[2017-05-03 12:22:47.674503] I [rpc-clnt.c:969:rpc_clnt_connection_init] 0-management: setting frame-timeout to 600
[2017-05-03 12:22:47.674655] W [socket.c:3004:socket_connect] 0-management: Ignore failed connection attempt on , (No such file or directory)
[2017-05-03 12:22:48.703206] I [rpc-clnt.c:969:rpc_clnt_connection_init] 0-management: setting frame-timeout to 600
[2017-05-03 12:22:48.703345] W [socket.c:3004:socket_connect] 0-management: Ignore failed connection attempt on , (No such file or directory)
[2017-05-03 12:22:49.458391] I [mem-pool.c:545:mem_pool_destroy] 0-management: size=588 max=0 total=0
[2017-05-03 12:22:49.458429] I [mem-pool.c:545:mem_pool_destroy] 0-management: size=124 max=0 total=0
[2017-05-03 12:22:49.470431] W [socket.c:620:__socket_rwv] 0-socket.management: writev on 127.0.0.1:985 failed (Broken pipe)
[2017-05-03 12:22:49.470450] I [socket.c:2353:socket_event_handler] 0-transport: disconnecting now
[2017-05-03 12:22:49.470929] W [socket.c:620:__socket_rwv] 0-socket.management: writev on 127.0.0.1:988 failed (Broken pipe)
[2017-05-03 12:22:49.470945] I [socket.c:2353:socket_event_handler] 0-transport: disconnecting now
[2017-05-03 12:22:49.473855] W [socket.c:620:__socket_rwv] 0-management: readv on /var/run/b10c65c880e831b5c91cf638e1c0e0e4.socket failed (Invalid argument)
[2017-05-03 12:22:49.473880] I [MSGID: 106006] [glusterd-handler.c:4290:__glusterd_nodesvc_rpc_notify] 0-management: nfs has disconnected from glusterd.
[2017-05-03 12:22:49.473907] I [mem-pool.c:545:mem_pool_destroy] 0-management: size=588 max=0 total=0
[2017-05-03 12:22:49.473930] I [mem-pool.c:545:mem_pool_destroy] 0-management: size=124 max=0 total=0
[2017-05-03 12:22:49.473986] W [socket.c:620:__socket_rwv] 0-management: readv on /var/run/6a75793fc0c76a2c9e9403f63ff38d99.socket failed (Invalid argument)
[2017-05-03 12:22:49.474003] I [MSGID: 106006] [glusterd-handler.c:4290:__glusterd_nodesvc_rpc_notify] 0-management: glustershd has disconnected from glusterd.
[2017-05-03 12:23:23.106811] E [glusterd-op-sm.c:3603:glusterd_op_ac_send_stage_op] 0-management: Staging of operation 'Volume Rebalance' failed on localhost : remove-brick not started.
[2017-05-03 12:29:22.407630] I [glusterd-handler.c:1296:__glusterd_handle_cli_get_volume] 0-glusterd: Received get vol req
[2017-05-03 12:29:44.157973] I [glusterd-handler.c:1296:__glusterd_handle_cli_get_volume] 0-glusterd: Received get vol req
[2017-05-03 12:30:23.522501] I [glusterd-handler.c:1296:__glusterd_handle_cli_get_volume] 0-glusterd: Received get vol req
******** Volume information *******
# gluster volume info glu_linux_dr2_oracle
Volume Name: glu_linux_dr2_oracle
Type: Distributed-Replicate
Volume ID: 3aef9266-0736-45b0-93bb-74248e18e85d
Status: Started
Number of Bricks: 3 x 2 = 6
Transport-type: tcp,rdma
Bricks:
Brick1: glustoretst01.net.dr.dk:/bricks/brick2/glu_linux_dr2_oracle
Brick2: glustoretst02.net.dr.dk:/bricks/brick2/glu_linux_dr2_oracle
Brick3: glustoretst01.net.dr.dk:/bricks/brick1/glu_linux_dr2_oracle
Brick4: glustoretst02.net.dr.dk:/bricks/brick1/glu_linux_dr2_oracle
Brick5: glustoretst03.net.dr.dk:/bricks/brick1/glu_linux_dr2_oracle
Brick6: glustoretst04.net.dr.dk:/bricks/brick1/glu_linux_dr2_oracle
Options Reconfigured:
features.quota: off
storage.owner-gid: 0
storage.owner-uid: 0
cluster.server-quorum-type: server
cluster.quorum-type: none
performance.stat-prefetch: off
performance.io-cache: off
performance.read-ahead: off
performance.quick-read: off
auth.allow: 10.101.*
user.cifs: disable
nfs.disable: on
cluster.server-quorum-ratio: 50%
******** Gluster Version *******
[root@glustertst01 glusterfs]# rpm -qa | grep glusterfs
glusterfs-fuse-3.6.9-1.el6.x86_64
glusterfs-server-3.6.9-1.el6.x86_64
glusterfs-libs-3.6.9-1.el6.x86_64
glusterfs-cli-3.6.9-1.el6.x86_64
glusterfs-api-3.6.9-1.el6.x86_64
glusterfs-3.6.9-1.el6.x86_64
glusterfs-geo-replication-3.6.9-1.el6.x86_64
Regards
Jesper