df hangs while a brick server is down (rdma transport)

Hi all,
We are using GlusterFS on a cluster of servers connected over InfiniBand.
When using the rdma transport, if one of these servers is down, "mount" succeeds but commands such as "df" hang.


These are the steps to reproduce:
 - create a volume of 3 bricks with rdma transport, each brick on a different server
 - start the volume
 - take one brick server down
 - mount the volume; "df -h" then hangs
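
For anyone trying to reproduce this, the last step can be checked without blocking the shell. The following is only a minimal sketch: the function name `df_hangs` and the 10-second default timeout are assumptions for illustration, not part of GlusterFS. It runs `df -h` on a mount point and reports whether the command failed to return in time.

```python
import subprocess


def df_hangs(mount_point, timeout_s=10.0):
    """Return True if `df -h <mount_point>` does not finish within timeout_s seconds.

    Hypothetical helper for illustration only; the timeout value is an assumption.
    """
    proc = subprocess.Popen(
        ["df", "-h", mount_point],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    try:
        proc.wait(timeout=timeout_s)
        return False  # df returned (successfully or with an error), so no hang
    except subprocess.TimeoutExpired:
        # Best effort: a process stuck in uninterruptible sleep may not die
        # immediately, so we do not wait for it here.
        proc.kill()
        return True
```

Calling `df_hangs` with the path where the volume is mounted should return `True` if the behaviour described above is reproduced, and `False` on a healthy mount.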


We have tested GlusterFS 3.2.5 and 3.2.7; both have this problem.


Thanks for any help.
Let me know if there's anything else I can provide!


Here is an excerpt from the glusterfs log:
......
[2013-10-22 11:07:41.335497] I [client-handshake.c:1090:select_server_supported_programs] 0-hash-01-client-0: Using Program GlusterFS 3.0.0, Num (1298437), Version (310)
[2013-10-22 11:07:41.335618] I [client-handshake.c:1090:select_server_supported_programs] 0-hash-01-client-2: Using Program GlusterFS 3.0.0, Num (1298437), Version (310)
[2013-10-22 11:07:41.335995] I [client-handshake.c:913:client_setvolume_cbk] 0-hash-01-client-0: Connected to 192.168.20.107:24013, attached to remote volume '/data/brick1'.
[2013-10-22 11:07:41.336119] I [client-handshake.c:913:client_setvolume_cbk] 0-hash-01-client-2: Connected to 192.168.20.108:24014, attached to remote volume '/data/brick1'.
[2013-10-22 11:07:41.591835] E [rdma.c:4417:tcp_connect_finish] 0-hash-01-client-1: tcp connect to  failed (No route to host)
[2013-10-22 11:07:41.591950] E [rdma.c:4417:tcp_connect_finish] 0-hash-01-client-1: tcp connect to  failed (No route to host)
[2013-10-22 11:07:41.591996] E [rdma.c:4417:tcp_connect_finish] 0-hash-01-client-1: tcp connect to  failed (No route to host)
[2013-10-22 11:07:44.592810] E [rdma.c:4417:tcp_connect_finish] 0-hash-01-client-1: tcp connect to  failed (No route to host)
[2013-10-22 11:07:44.592917] E [rdma.c:4417:tcp_connect_finish] 0-hash-01-client-1: tcp connect to  failed (No route to host)
[2013-10-22 11:07:44.592963] E [rdma.c:4417:tcp_connect_finish] 0-hash-01-client-1: tcp connect to  failed (No route to host)
(the errors above keep repeating)
......
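
To see at a glance which client translator is failing in a longer log, the error lines can be tallied per translator name. This is a rough sketch that assumes every line follows the layout seen in the excerpt above (timestamp, level letter, source location, translator name, message); the regular expression and the function name `count_errors_by_client` are illustrative assumptions, not a GlusterFS tool.

```python
import re
from collections import Counter

# Assumed layout, inferred from the excerpt:
# [timestamp] LEVEL [file:line:function] translator-name: message
LINE_RE = re.compile(
    r"^\[(?P<ts>[^\]]+)\] (?P<level>[A-Z]) \[(?P<src>[^\]]+)\] "
    r"(?P<xlator>[^:]+): (?P<msg>.*)$"
)


def count_errors_by_client(lines):
    """Count level-E log lines per translator name (e.g. 0-hash-01-client-1)."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match and match.group("level") == "E":
            counts[match.group("xlator")] += 1
    return counts
```

Passing the lines of the client log file to `count_errors_by_client` yields a counter keyed by translator name; in the excerpt above, all error lines belong to `0-hash-01-client-1`.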

