On 07/22/2015 12:55 PM, Geoffrey Letessier wrote:
Concerning the hang, I have only seen it once with the TCP protocol; RDMA actually seems to be the cause.
If you mount a tcp,rdma volume using the tcp protocol, all communication will go over the TCP connection; RDMA will not be used between client and server.
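For reference, with a tcp,rdma volume the FUSE client picks the transport at mount time; appending ".rdma" to the volume name selects RDMA (a sketch using the ib-storage1 hostname and /workdir mount point from this thread):

# mount -t glusterfs ib-storage1:/vol_workdir_amd /workdir
# mount -t glusterfs ib-storage1:/vol_workdir_amd.rdma /workdir

The first mount uses the default TCP transport; the second requests RDMA, which matches the "localhost:vol_workdir_amd.rdma" entry visible in the df output further below.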
… And, after a moment (a few minutes after having restarted my back-transfer of around 40 TB), my volume went down (and all my rsync processes with it):
[root@atlas ~]# df -h /mnt
df: « /mnt »: Noeud final de transport n'est pas connecté
df: aucun système de fichiers traité
aka "transport endpoint is not connected" / "no file systems processed"
Could you send me the following details, if possible?
1) the mount command used, 2) volume status, 3) client and brick logs
Regards
Rafi KC
Geoffrey
------------------------------------------------------
Geoffrey Letessier
Responsable informatique & ingénieur système
UPR 9080 - CNRS - Laboratoire de Biochimie Théorique
Institut de Biologie Physico-Chimique
13, rue Pierre et Marie Curie - 75005 Paris
Tel: 01 58 41 50 93 - eMail: geoffrey.letessier@xxxxxxx
Hi Rafi,
That is what I do. But I notice this kind of trouble particularly when I mount my volumes manually.
In addition, when I changed my transport-type from tcp or rdma to tcp,rdma, I had to restart my volumes for the change to take effect.
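(For reference, a sketch of such a transport change, assuming the config.transport volume option; the volume has to be stopped first:)

# gluster volume stop vol_workdir_amd
# gluster volume set vol_workdir_amd config.transport tcp,rdma
# gluster volume start vol_workdir_amd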
I wonder whether these troubles are not due to the RDMA protocol… because everything looks more stable with TCP.
Any other idea?
Thanks in advance for replying,
Geoffrey
------------------------------------------------------
Geoffrey Letessier
Responsable informatique & ingénieur système
UPR 9080 - CNRS - Laboratoire de Biochimie Théorique
Institut de Biologie Physico-Chimique
13, rue Pierre et Marie Curie - 75005 Paris
Tel: 01 58 41 50 93 - eMail: geoffrey.letessier@xxxxxxx
On 07/22/2015 04:51 AM, Geoffrey Letessier wrote:
Hi Niels,
Thanks for replying.
In fact, after having checked the logs, I discovered that GlusterFS tried to connect to a brick on a TCP (or RDMA) port allocated to another volume… (bug?)
For example, here is an extract of my workdir.log file:
[2015-07-21 21:34:01.820188] E [socket.c:2332:socket_connect_finish] 0-vol_workdir_amd-client-0: connection to 10.0.4.1:49161 failed (Connexion refusée)
[2015-07-21 21:34:01.822563] E [socket.c:2332:socket_connect_finish] 0-vol_workdir_amd-client-2: connection to 10.0.4.1:49162 failed (Connexion refusée)
("Connexion refusée" = "Connection refused")
But those two ports (49161 and 49162) belonged only to my vol_home volume, not to vol_workdir_amd.
Now, after having restarted all glusterd daemons synchronously (pdsh -w cl-storage[1-4] service glusterd restart), everything seems to be back to normal (size, write permissions, etc.).
But, a few minutes later, I noticed a strange thing I have been seeing since I upgraded my storage cluster from 3.5.3 to 3.7.2-3: when I try to mount some volumes (particularly my vol_shared volume, a replicated volume), my system can hang… And, because I use it in my bashrc file for my environment modules, I need to restart my node. The same happens if I run df on the mounted volume (when it doesn't hang during the mount itself).
With the TCP transport type, the situation seems more stable.
In addition: if I restart a storage node, I cannot use the Gluster CLI (it also hangs).
Do you have an idea?
Are you using a bash script to start and mount the volume? If so, add a sleep after the volume start and before the mount, to allow all the processes to start properly, because the RDMA protocol takes some time to initialize its resources.
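A minimal sketch of what that could look like (the 10-second delay, volume name, and mount point are assumptions to adapt):

#!/bin/bash
# Start the volume, then give the brick processes and RDMA initialization time to settle.
gluster volume start vol_shared
sleep 10
mount -t glusterfs ib-storage1:/vol_shared.rdma /shared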
Regards
Rafi KC
Once again, thanks a lot for your help,
Geoffrey
------------------------------------------------------
Geoffrey Letessier
Responsable informatique & ingénieur système
UPR 9080 - CNRS - Laboratoire de Biochimie Théorique
Institut de Biologie Physico-Chimique
13, rue Pierre et Marie Curie - 75005 Paris
Tel: 01 58 41 50 93 - eMail: geoffrey.letessier@xxxxxxx
On Tue, Jul 21, 2015 at 11:20:20PM +0200, Geoffrey Letessier wrote:
Hello Soumya, hello everybody,
network.ping-timeout was set to 42 seconds. I set it to 0 but it made no difference. The problem was that, after having re-set the transport-type to rdma,tcp, some bricks went down after a few minutes. Despite restarting the volumes, after a few minutes some [other/different] bricks went down again.
I'm not sure whether the ping-timeout is handled differently when RDMA is used. Adding two of the guys who know RDMA well on CC.
Now, after re-creating my volume, the bricks stay alive but, oddly, I am not able to write to my volume. In addition, I defined a distributed volume with 2 servers and 4 bricks of 250 GB each, yet my final volume seems to be sized at only 500 GB… It's astonishing.
As seen further below, the 500 GB volume size is caused by two unreachable bricks. When bricks are not reachable, their size cannot be detected by the client, and therefore 2x 250 GB is missing.
It is unclear to me why writing to a pure distributed volume fails. When a brick is not reachable and the file should be created there, it would normally get created on another brick. When the brick that should have the file comes back online, and a new lookup for the file is done, a so-called "link file" is created, which points to the file on the other brick. I guess the failure has to do with the connection issues, and I would suggest getting that solved first.
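For illustration, such a link file can be inspected directly on a brick: it appears as an empty file with mode ---------T, carrying a trusted.glusterfs.dht.linkto xattr that names the subvolume actually holding the data (a sketch; the brick path and file name are only examples):

# stat /export/brick_workdir/brick1/data/test
# getfattr -n trusted.glusterfs.dht.linkto /export/brick_workdir/brick1/data/test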
HTH,
Niels
Here you can find some information:
# gluster volume status vol_workdir_amd
Status of volume: vol_workdir_amd
Gluster process                                       TCP Port  RDMA Port  Online  Pid
---------------------------------------------------------------------------------------
Brick ib-storage1:/export/brick_workdir/brick1/data   49185     49186      Y       23098
Brick ib-storage3:/export/brick_workdir/brick1/data   49158     49159      Y       3886
Brick ib-storage1:/export/brick_workdir/brick2/data   49187     49188      Y       23117
Brick ib-storage3:/export/brick_workdir/brick2/data   49160     49161      Y       3905
# gluster volume info vol_workdir_amd
Volume Name: vol_workdir_amd
Type: Distribute
Volume ID: 087d26ea-c6df-4cbe-94af-ecd87b59aedb
Status: Started
Number of Bricks: 4
Transport-type: tcp,rdma
Bricks:
Brick1: ib-storage1:/export/brick_workdir/brick1/data
Brick2: ib-storage3:/export/brick_workdir/brick1/data
Brick3: ib-storage1:/export/brick_workdir/brick2/data
Brick4: ib-storage3:/export/brick_workdir/brick2/data
Options Reconfigured:
performance.readdir-ahead: on
# pdsh -w storage[1,3] df -h /export/brick_workdir/brick{1,2}
storage3: Filesystem                            Size  Used Avail Use% Mounted on
storage3: /dev/mapper/st--block1-blk1--workdir  250G   34M  250G   1% /export/brick_workdir/brick1
storage3: /dev/mapper/st--block2-blk2--workdir  250G   34M  250G   1% /export/brick_workdir/brick2
storage1: Filesystem                            Size  Used Avail Use% Mounted on
storage1: /dev/mapper/st--block1-blk1--workdir  250G   33M  250G   1% /export/brick_workdir/brick1
storage1: /dev/mapper/st--block2-blk2--workdir  250G   33M  250G   1% /export/brick_workdir/brick2
# df -h /workdir/
Filesystem                      Size  Used Avail Use% Mounted on
localhost:vol_workdir_amd.rdma  500G   67M  500G   1% /workdir
# touch /workdir/test
touch: impossible de faire un touch « /workdir/test »: Aucun fichier ou dossier de ce type
(i.e. "cannot touch '/workdir/test': No such file or directory")
# tail -30l /var/log/glusterfs/workdir.log
Host Unreachable, Check your connection with IPoIB
[2015-07-21 21:10:33.927673] W [rdma.c:1263:gf_rdma_cm_event_handler] 0-vol_workdir_amd-client-2: cma event RDMA_CM_EVENT_REJECTED, error 8 (me:10.0.4.1:1020 peer:10.0.4.1:49174)
Host Unreachable, Check your connection with IPoIB
[2015-07-21 21:10:37.877231] I [rpc-clnt.c:1819:rpc_clnt_reconfig] 0-vol_workdir_amd-client-0: changing port to 49173 (from 0)
[2015-07-21 21:10:37.880556] I [rpc-clnt.c:1819:rpc_clnt_reconfig] 0-vol_workdir_amd-client-2: changing port to 49174 (from 0)
[2015-07-21 21:10:37.914661] W [rdma.c:1263:gf_rdma_cm_event_handler] 0-vol_workdir_amd-client-0: cma event RDMA_CM_EVENT_REJECTED, error 8 (me:10.0.4.1:1021 peer:10.0.4.1:49173)
Host Unreachable, Check your connection with IPoIB
[2015-07-21 21:10:37.923535] W [rdma.c:1263:gf_rdma_cm_event_handler] 0-vol_workdir_amd-client-2: cma event RDMA_CM_EVENT_REJECTED, error 8 (me:10.0.4.1:1020 peer:10.0.4.1:49174)
Host Unreachable, Check your connection with IPoIB
[2015-07-21 21:10:41.883925] I [rpc-clnt.c:1819:rpc_clnt_reconfig] 0-vol_workdir_amd-client-0: changing port to 49173 (from 0)
[2015-07-21 21:10:41.887085] I [rpc-clnt.c:1819:rpc_clnt_reconfig] 0-vol_workdir_amd-client-2: changing port to 49174 (from 0)
[2015-07-21 21:10:41.919394] W [rdma.c:1263:gf_rdma_cm_event_handler] 0-vol_workdir_amd-client-0: cma event RDMA_CM_EVENT_REJECTED, error 8 (me:10.0.4.1:1021 peer:10.0.4.1:49173)
Host Unreachable, Check your connection with IPoIB
[2015-07-21 21:10:41.932622] W [rdma.c:1263:gf_rdma_cm_event_handler] 0-vol_workdir_amd-client-2: cma event RDMA_CM_EVENT_REJECTED, error 8 (me:10.0.4.1:1020 peer:10.0.4.1:49174)
Host Unreachable, Check your connection with IPoIB
[2015-07-21 21:10:44.682636] W [dht-layout.c:189:dht_layout_search] 0-vol_workdir_amd-dht: no subvolume for hash (value) = 1072520554
[2015-07-21 21:10:44.682947] W [dht-layout.c:189:dht_layout_search] 0-vol_workdir_amd-dht: no subvolume for hash (value) = 1072520554
[2015-07-21 21:10:44.683240] W [dht-layout.c:189:dht_layout_search] 0-vol_workdir_amd-dht: no subvolume for hash (value) = 1072520554
[2015-07-21 21:10:44.683472] W [dht-diskusage.c:48:dht_du_info_cbk] 0-vol_workdir_amd-dht: failed to get disk info from vol_workdir_amd-client-0
[2015-07-21 21:10:44.683506] W [dht-diskusage.c:48:dht_du_info_cbk] 0-vol_workdir_amd-dht: failed to get disk info from vol_workdir_amd-client-2
[2015-07-21 21:10:44.683532] W [dht-layout.c:189:dht_layout_search] 0-vol_workdir_amd-dht: no subvolume for hash (value) = 1072520554
[2015-07-21 21:10:44.683551] W [fuse-bridge.c:1970:fuse_create_cbk] 0-glusterfs-fuse: 18: /test => -1 (Aucun fichier ou dossier de ce type)
[2015-07-21 21:10:44.683619] W [dht-layout.c:189:dht_layout_search] 0-vol_workdir_amd-dht: no subvolume for hash (value) = 1072520554
[2015-07-21 21:10:44.683846] W [dht-layout.c:189:dht_layout_search] 0-vol_workdir_amd-dht: no subvolume for hash (value) = 1072520554
[2015-07-21 21:10:45.886807] I [rpc-clnt.c:1819:rpc_clnt_reconfig] 0-vol_workdir_amd-client-0: changing port to 49173 (from 0)
[2015-07-21 21:10:45.893059] I [rpc-clnt.c:1819:rpc_clnt_reconfig] 0-vol_workdir_amd-client-2: changing port to 49174 (from 0)
[2015-07-21 21:10:45.920434] W [rdma.c:1263:gf_rdma_cm_event_handler] 0-vol_workdir_amd-client-0: cma event RDMA_CM_EVENT_REJECTED, error 8 (me:10.0.4.1:1021 peer:10.0.4.1:49173)
Host Unreachable, Check your connection with IPoIB
[2015-07-21 21:10:45.925292] W [rdma.c:1263:gf_rdma_cm_event_handler] 0-vol_workdir_amd-client-2: cma event RDMA_CM_EVENT_REJECTED, error 8 (me:10.0.4.1:1020 peer:10.0.4.1:49174)
Host Unreachable, Check your connection with IPoIB
I have been using GlusterFS in production for around 3 years without any blocking problem, but the situation has been dreadful for more than 3 weeks now… Indeed, our production has been down for roughly 3.5 weeks (with many different problems, first with GlusterFS v3.5.3 and now with 3.7.2-3), and I need to get it back up…
Thanks in advance,
Geoffrey
------------------------------------------------------
Geoffrey Letessier
Responsable informatique & ingénieur système
UPR 9080 - CNRS - Laboratoire de Biochimie Théorique
Institut de Biologie Physico-Chimique
13, rue Pierre et Marie Curie - 75005 Paris
Tel: 01 58 41 50 93 - eMail: geoffrey.letessier@xxxxxxx
On 21 Jul 2015, at 19:36, Soumya Koduri <skoduri@xxxxxxxxxx> wrote:
From the following errors,
[2015-07-21 14:36:30.495321] I [MSGID: 114020] [client.c:2118:notify] 0-vol_shared-client-0: parent translators are ready, attempting connect on transport
[2015-07-21 14:36:30.498989] W [socket.c:923:__socket_keepalive] 0-socket: failed to set TCP_USER_TIMEOUT 0 on socket 12, Protocole non disponible
[2015-07-21 14:36:30.499004] E [socket.c:3015:socket_connect] 0-vol_shared-client-0: Failed to set keep-alive: Protocole non disponible
it looks like setting the TCP_USER_TIMEOUT value to 0 on the socket failed with (IIUC) "Protocol not available" ("Protocole non disponible"). Could you check whether 'network.ping-timeout' is set to zero for that volume using 'gluster volume info'? Anyway, from the code it looks like TCP_USER_TIMEOUT can take the value zero; I am not sure why it failed.
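(For reference, a quick way to check whether the option was reconfigured, and to set it back to the 42-second default; vol_shared is assumed as the volume name:)

# gluster volume info vol_shared | grep ping-timeout
# gluster volume set vol_shared network.ping-timeout 42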
Niels, any thoughts?
Thanks,
Soumya