Hi Rafi,
That's what I usually do. But I notice this kind of trouble particularly when I mount my volumes manually.
In addition, when I changed my transport-type from tcp or rdma to tcp,rdma, I had to restart my volume for the change to take effect.
I wonder whether these troubles are due to the RDMA protocol, because everything looks more stable with the TCP one.
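For reference, a transport-type change only takes effect after a full volume stop/start. A minimal sketch of the sequence (volume name taken from this thread; `config.transport` is assumed to be the option name in this GlusterFS release):

```shell
# Sketch: switch an existing volume to tcp,rdma.
# A plain glusterd restart is not enough -- stop and start the volume.
gluster volume stop vol_workdir_amd
gluster volume set vol_workdir_amd config.transport tcp,rdma
gluster volume start vol_workdir_amd
```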
Any other ideas? Thanks in advance for your reply,
Geoffrey
------------------------------------------------------
Geoffrey Letessier
Responsable informatique & ingénieur système
UPR 9080 - CNRS - Laboratoire de Biochimie Théorique
Institut de Biologie Physico-Chimique
13, rue Pierre et Marie Curie - 75005 Paris
Tel: 01 58 41 50 93 - eMail: geoffrey.letessier@xxxxxxx
On 07/22/2015 04:51 AM, Geoffrey Letessier wrote:
Hi Niels,
Thanks for replying.
In fact, after checking the logs, I discovered that GlusterFS tried to connect to a brick on a TCP (or RDMA) port allocated to another volume… (a bug?)
For example, here is an extract of my workdir.log file:
[2015-07-21 21:34:01.820188] E [socket.c:2332:socket_connect_finish] 0-vol_workdir_amd-client-0: connection to 10.0.4.1:49161 failed (Connexion refusée)
[2015-07-21 21:34:01.822563] E [socket.c:2332:socket_connect_finish] 0-vol_workdir_amd-client-2: connection to 10.0.4.1:49162 failed (Connexion refusée)
But those two ports (49161 and 49162) belong only to my vol_home volume, not to vol_workdir_amd.
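One way to cross-check which brick actually owns a given port is to parse the `gluster volume status` output. A sketch (the `find_port_owner` helper is hypothetical; the sample line mirrors the status table further below):

```shell
#!/bin/sh
# Hypothetical helper: given 'gluster volume status' lines of the form
#   Brick <host:path> <tcp-port> <rdma-port> <online> <pid>
# print the brick that listens on the given TCP or RDMA port.
find_port_owner() {
  port=$1
  awk -v p="$port" '$1 == "Brick" && ($3 == p || $4 == p) { print $2 }'
}

# In practice: gluster volume status all | find_port_owner 49161
echo 'Brick ib-storage1:/export/brick_workdir/brick1/data 49185 49186 Y 23098' \
  | find_port_owner 49186
# -> ib-storage1:/export/brick_workdir/brick1/data
```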
Now, after restarting all glusterd daemons synchronously (pdsh -w cl-storage[1-4] service glusterd restart), everything seems to be back to normal (size, write permission, etc.).
But a few minutes later I noticed a strange thing I have been seeing since I upgraded my storage cluster from 3.5.3 to 3.7.2-3: when I try to mount some volumes (particularly my vol_shared volume, a replicated volume), my system can hang… And because I use it in my bashrc file for my environment modules, I then need to restart the node. The same happens if I run df on the mounted volume (when it doesn't already hang during the mount).
With the TCP transport type, the situation seems more stable.
In addition: if I restart a storage node, I can't use the Gluster CLI (it also hangs).
Do you have any ideas?
Are you using a bash script to start/mount the volume? If so, add a sleep between the volume start and the mount, to allow all the processes to start properly, because the RDMA protocol takes some time to initialize its resources.
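Rafi's suggestion could be sketched as a small wrapper that, rather than relying on a single fixed sleep, retries the mount until the bricks are up (the `retry` helper and the attempt/delay figures are illustrative, not from this thread):

```shell
#!/bin/sh
# Illustrative retry helper: run a command up to N times, pausing
# between attempts, so RDMA resources have time to initialise.
retry() {
  attempts=$1; delay=$2; shift 2
  i=1
  while [ "$i" -le "$attempts" ]; do
    "$@" && return 0
    sleep "$delay"
    i=$((i + 1))
  done
  return 1
}

# Example usage (commented out -- needs a live cluster):
# gluster volume start vol_workdir_amd
# retry 10 3 mount -t glusterfs localhost:/vol_workdir_amd.rdma /workdir
```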
Regards
Rafi KC
Once again, thanks a lot for your help,
Geoffrey
On Tue, Jul 21, 2015 at 11:20:20PM +0200, Geoffrey Letessier wrote:
Hello Soumya, Hello everybody,
network.ping-timeout was set to 42 seconds. I set it to 0 but saw no difference. The problem was that, after re-setting the transport-type to rdma,tcp, some bricks went down after a few minutes. Despite restarting the volumes, after a few more minutes some other/different bricks went down again.
I'm not sure if the ping-timeout is handled differently when RDMA is used. Adding two of the guys who know RDMA well on CC.
Now, after re-creating my volume, the bricks stay alive but, oddly, I'm not able to write to my volume. In addition, I defined a distributed volume across 2 servers with 4 bricks of 250GB each, and my final volume appears to be only 500GB in size… It's baffling.
As seen further below, the 500GB volume size is caused by two unreachable bricks. When bricks are not reachable, their size cannot be detected by the client, and therefore 2x 250 GB is missing.
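Niels's explanation of the 500GB figure is simple arithmetic; a throwaway check (brick counts and sizes taken from this thread):

```shell
# 4 bricks of 250 GB each should aggregate to 1000 GB, but with 2 bricks
# unreachable the client can only account for the remaining 2.
bricks_total=4; brick_gb=250; bricks_down=2
echo "expected: $(( bricks_total * brick_gb )) GB"
echo "visible: $(( (bricks_total - bricks_down) * brick_gb )) GB"
```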
It is unclear to me why writing to a pure distributed volume fails. When a brick is not reachable and a file should be created there, the file would normally get created on another brick. When the brick that should have the file comes back online and a new lookup for the file is done, a so-called "link file" is created, which points to the file on the other brick. I guess the failure has to do with the connection issues, and I would suggest getting those solved first.
HTH,
Niels
Here you can find some information:
# gluster volume status vol_workdir_amd
Status of volume: vol_workdir_amd
Gluster process                                      TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick ib-storage1:/export/brick_workdir/brick1/data  49185     49186      Y       23098
Brick ib-storage3:/export/brick_workdir/brick1/data  49158     49159      Y       3886
Brick ib-storage1:/export/brick_workdir/brick2/data  49187     49188      Y       23117
Brick ib-storage3:/export/brick_workdir/brick2/data  49160     49161      Y       3905
# gluster volume info vol_workdir_amd
Volume Name: vol_workdir_amd
Type: Distribute
Volume ID: 087d26ea-c6df-4cbe-94af-ecd87b59aedb
Status: Started
Number of Bricks: 4
Transport-type: tcp,rdma
Bricks:
Brick1: ib-storage1:/export/brick_workdir/brick1/data
Brick2: ib-storage3:/export/brick_workdir/brick1/data
Brick3: ib-storage1:/export/brick_workdir/brick2/data
Brick4: ib-storage3:/export/brick_workdir/brick2/data
Options Reconfigured:
performance.readdir-ahead: on
# pdsh -w storage[1,3] df -h /export/brick_workdir/brick{1,2}
storage3: Filesystem                            Size  Used  Avail  Use%  Mounted on
storage3: /dev/mapper/st--block1-blk1--workdir  250G  34M   250G   1%    /export/brick_workdir/brick1
storage3: /dev/mapper/st--block2-blk2--workdir  250G  34M   250G   1%    /export/brick_workdir/brick2
storage1: Filesystem                            Size  Used  Avail  Use%  Mounted on
storage1: /dev/mapper/st--block1-blk1--workdir  250G  33M   250G   1%    /export/brick_workdir/brick1
storage1: /dev/mapper/st--block2-blk2--workdir  250G  33M   250G   1%    /export/brick_workdir/brick2
# df -h /workdir/
Filesystem                      Size  Used  Avail  Use%  Mounted on
localhost:vol_workdir_amd.rdma  500G  67M   500G   1%    /workdir
# touch /workdir/test
touch: impossible de faire un touch « /workdir/test »: Aucun fichier ou dossier de ce type
# tail -30l /var/log/glusterfs/workdir.log
Host Unreachable, Check your connection with IPoIB
[2015-07-21 21:10:33.927673] W [rdma.c:1263:gf_rdma_cm_event_handler] 0-vol_workdir_amd-client-2: cma event RDMA_CM_EVENT_REJECTED, error 8 (me:10.0.4.1:1020 peer:10.0.4.1:49174)
Host Unreachable, Check your connection with IPoIB
[2015-07-21 21:10:37.877231] I [rpc-clnt.c:1819:rpc_clnt_reconfig] 0-vol_workdir_amd-client-0: changing port to 49173 (from 0)
[2015-07-21 21:10:37.880556] I [rpc-clnt.c:1819:rpc_clnt_reconfig] 0-vol_workdir_amd-client-2: changing port to 49174 (from 0)
[2015-07-21 21:10:37.914661] W [rdma.c:1263:gf_rdma_cm_event_handler] 0-vol_workdir_amd-client-0: cma event RDMA_CM_EVENT_REJECTED, error 8 (me:10.0.4.1:1021 peer:10.0.4.1:49173)
Host Unreachable, Check your connection with IPoIB
[2015-07-21 21:10:37.923535] W [rdma.c:1263:gf_rdma_cm_event_handler] 0-vol_workdir_amd-client-2: cma event RDMA_CM_EVENT_REJECTED, error 8 (me:10.0.4.1:1020 peer:10.0.4.1:49174)
Host Unreachable, Check your connection with IPoIB
[2015-07-21 21:10:41.883925] I [rpc-clnt.c:1819:rpc_clnt_reconfig] 0-vol_workdir_amd-client-0: changing port to 49173 (from 0)
[2015-07-21 21:10:41.887085] I [rpc-clnt.c:1819:rpc_clnt_reconfig] 0-vol_workdir_amd-client-2: changing port to 49174 (from 0)
[2015-07-21 21:10:41.919394] W [rdma.c:1263:gf_rdma_cm_event_handler] 0-vol_workdir_amd-client-0: cma event RDMA_CM_EVENT_REJECTED, error 8 (me:10.0.4.1:1021 peer:10.0.4.1:49173)
Host Unreachable, Check your connection with IPoIB
[2015-07-21 21:10:41.932622] W [rdma.c:1263:gf_rdma_cm_event_handler] 0-vol_workdir_amd-client-2: cma event RDMA_CM_EVENT_REJECTED, error 8 (me:10.0.4.1:1020 peer:10.0.4.1:49174)
Host Unreachable, Check your connection with IPoIB
[2015-07-21 21:10:44.682636] W [dht-layout.c:189:dht_layout_search] 0-vol_workdir_amd-dht: no subvolume for hash (value) = 1072520554
[2015-07-21 21:10:44.682947] W [dht-layout.c:189:dht_layout_search] 0-vol_workdir_amd-dht: no subvolume for hash (value) = 1072520554
[2015-07-21 21:10:44.683240] W [dht-layout.c:189:dht_layout_search] 0-vol_workdir_amd-dht: no subvolume for hash (value) = 1072520554
[2015-07-21 21:10:44.683472] W [dht-diskusage.c:48:dht_du_info_cbk] 0-vol_workdir_amd-dht: failed to get disk info from vol_workdir_amd-client-0
[2015-07-21 21:10:44.683506] W [dht-diskusage.c:48:dht_du_info_cbk] 0-vol_workdir_amd-dht: failed to get disk info from vol_workdir_amd-client-2
[2015-07-21 21:10:44.683532] W [dht-layout.c:189:dht_layout_search] 0-vol_workdir_amd-dht: no subvolume for hash (value) = 1072520554
[2015-07-21 21:10:44.683551] W [fuse-bridge.c:1970:fuse_create_cbk] 0-glusterfs-fuse: 18: /test => -1 (Aucun fichier ou dossier de ce type)
[2015-07-21 21:10:44.683619] W [dht-layout.c:189:dht_layout_search] 0-vol_workdir_amd-dht: no subvolume for hash (value) = 1072520554
[2015-07-21 21:10:44.683846] W [dht-layout.c:189:dht_layout_search] 0-vol_workdir_amd-dht: no subvolume for hash (value) = 1072520554
[2015-07-21 21:10:45.886807] I [rpc-clnt.c:1819:rpc_clnt_reconfig] 0-vol_workdir_amd-client-0: changing port to 49173 (from 0)
[2015-07-21 21:10:45.893059] I [rpc-clnt.c:1819:rpc_clnt_reconfig] 0-vol_workdir_amd-client-2: changing port to 49174 (from 0)
[2015-07-21 21:10:45.920434] W [rdma.c:1263:gf_rdma_cm_event_handler] 0-vol_workdir_amd-client-0: cma event RDMA_CM_EVENT_REJECTED, error 8 (me:10.0.4.1:1021 peer:10.0.4.1:49173)
Host Unreachable, Check your connection with IPoIB
[2015-07-21 21:10:45.925292] W [rdma.c:1263:gf_rdma_cm_event_handler] 0-vol_workdir_amd-client-2: cma event RDMA_CM_EVENT_REJECTED, error 8 (me:10.0.4.1:1020 peer:10.0.4.1:49174)
Host Unreachable, Check your connection with IPoIB
I have been using GlusterFS in production for around 3 years without any blocking problem, but the situation has been dire for more than 3 weeks now… Indeed, our production has been down for roughly 3.5 weeks (with many different problems, first with GlusterFS v3.5.3 and now with 3.7.2-3), and I need to get it running again…
Thanks in advance,
Geoffrey
On 21 Jul 2015, at 19:36, Soumya Koduri <skoduri@xxxxxxxxxx> wrote:
From the following errors,
[2015-07-21 14:36:30.495321] I [MSGID: 114020] [client.c:2118:notify] 0-vol_shared-client-0: parent translators are ready, attempting connect on transport
[2015-07-21 14:36:30.498989] W [socket.c:923:__socket_keepalive] 0-socket: failed to set TCP_USER_TIMEOUT 0 on socket 12, Protocole non disponible
[2015-07-21 14:36:30.499004] E [socket.c:3015:socket_connect] 0-vol_shared-client-0: Failed to set keep-alive: Protocole non disponible
it looks like setting the TCP_USER_TIMEOUT value to 0 on the socket failed with the error (IIUC) "Protocol not available". Could you check whether 'network.ping-timeout' is set to zero for that volume using 'gluster volume info'? Anyway, from the code it looks like 'TCP_USER_TIMEOUT' can take the value zero. I am not sure why it failed.
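A quick way to run the check Soumya suggests (volume name taken from this thread):

```shell
# Show whether network.ping-timeout was overridden for the volume;
# if nothing is printed, the option is still at its default (42s).
gluster volume info vol_shared | grep -i ping-timeout
```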
Niels, any thoughts?
Thanks,
Soumya
On 07/21/2015 08:15 PM, Geoffrey Letessier wrote:
[2015-07-21 14:36:30.495321] I [MSGID: 114020] [client.c:2118:notify] 0-vol_shared-client-0: parent translators are ready, attempting connect on transport
[2015-07-21 14:36:30.498989] W [socket.c:923:__socket_keepalive] 0-socket: failed to set TCP_USER_TIMEOUT 0 on socket 12, Protocole non disponible
[2015-07-21 14:36:30.499004] E [socket.c:3015:socket_connect] 0-vol_shared-client-0: Failed to set keep-alive: Protocole non disponible