"tcp connect to failed" messages

I tried downgrading to 3.2 and the rdma connection works, with 
substantially improved performance in some cases over 3.3 using IP over IB.

Is there a version of 3.3 that anyone is using in production with rdma 
that I can try?

Iain


On 05/30/13 09:24, Iain Buchanan wrote:
> I'm now able to connect to volumes using rdma when they are created 
> using "tcp,rdma", but it appears that the data is still being 
> transferred over ethernet. If I run "ifstat" while running iozone I 
> can see a lot of data being moved through the ethernet adapter 
> (nothing else is on the box) and the performance is basically 
> identical to plain ethernet.  When I create the volume with just 
> "rdma" I can't even mount the volume again (see error below).
>
> I've modified all the volume files, adding the 
> transport.rdma.listen-port lines after completing set-up (anywhere 
> there is an "option transport-type rdma" I've added this line - quite 
> a few places; the result is sketched below).
>
>     volume create storage transport tcp,rdma my_server1:/data/area
>     volume add-brick storage replica 2 my_server2:/data/area
>     volume start storage
>     mount -t glusterfs my_server:storage.rdma /mnt/storage
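>
> Each edited client block ends up looking roughly like this (a sketch - 
> the generated volfiles contain more options than shown here):
>
>     volume storage-client-0
>         type protocol/client
>         option remote-host my_server1
>         option remote-subvolume /data/area
>         option transport-type rdma
>         option transport.rdma.listen-port 24008
>     end-volume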
>
> I'm now back to the error I had on tcp,rdma before the workaround - 
> the lines are in the config files, and I've restarted the service.  I 
> can see my alterations in the config dumped into the log, but then it 
> returns to the original errors:
>
> [2013-05-30 09:13:41.871408] E [rdma.c:4604:tcp_connect_finish] 
> 0-storage-client-0: tcp connect to  failed (Connection refused)
> [2013-05-30 09:13:41.871467] W [rdma.c:4187:gf_rdma_disconnect] 
> (-->/usr/sbin/glusterfs(main+0x34d) [0x7f563a24c3ed] 
> (-->/usr/lib/libglusterfs.so.0(+0x3bd17) [0x7f5639de5d17] 
> (-->/usr/lib/glusterfs/3.3.1/rpc-transport/rdma.so(+0x5231) 
> [0x7f5634012231]))) 0-storage-client-0: disconnect called (peer:)
>
> I saw a message on the mailing list that seems to suggest plain "rdma" 
> doesn't work in the 3.3 series - is this correct?
> http://www.gluster.org/pipermail/gluster-users/2013-January/035115.html
>
> (I can run ib_read_bw etc. between the two servers without problems.)
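>
> For reference, the fabric check itself (standard perftest usage - 
> start the server side with no arguments, then point the client at it):
>
>     ib_read_bw               # on my_server1
>     ib_read_bw my_server1    # on my_server2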
>
> Iain
>
>
> On 05/30/13 08:11, Iain Buchanan wrote:
>> Just to confirm - putting the line "option transport.rdma.listen-port 
>> 24008" into the first "volume" block in the two files with "rdma" in 
>> their names under /var/lib/glusterd/vols/<volumename> seems to have 
>> fixed the issue.  I'm now able to mount and I can run iozone on a 
>> single node.  I'll give it a go with two nodes and see if rdma makes 
>> any difference.
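>>
>> Concretely, for anyone else hitting this (assuming a volume named 
>> "storage"):
>>
>>     ls /var/lib/glusterd/vols/storage/*rdma*
>>     # in each file listed, add to the first "volume" block:
>>     #     option transport.rdma.listen-port 24008
>>     # then restart the gluster daemon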
>>
>> Thanks for your help Joe!
>>
>> Iain
>>
>>
>> On 05/30/13 07:59, Joe Julian wrote:
>>> On 05/29/2013 11:42 PM, Iain Buchanan wrote:
>>>> Thanks Joe,
>>>>
>>>> I tried mounting with the ".rdma" suffix after creating using 
>>>> "transport tcp,rdma" and I get the same "tcp connect to  failed" 
>>>> messages in the log - I'm using the version from the semiosis PPA 
>>>> at https://launchpad.net/~semiosis/+archive/ubuntu-glusterfs-3.3
>>>>
>>>> Looking at the dates on there I don't think it includes this fix.  
>>>> I've sent the maintainer a message asking if they could update it.  
>>>> In the meantime would Niels de Vos' workaround work?  (Setting 
>>>> transport.rdma.listen-port to 24008 in the glusterd.vol?)
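>>>>
>>>> (My reading of that workaround, assuming the stock 3.3 glusterd.vol, 
>>>> is an edit along these lines:)
>>>>
>>>>     volume management
>>>>         type mgmt/glusterd
>>>>         option working-directory /var/lib/glusterd
>>>>         option transport-type socket,rdma
>>>>         option transport.rdma.listen-port 24008
>>>>     end-volume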
>>>
>>> Yes, I had someone else try that and it worked. Remember, any 
>>> changes made to the volume through the CLI will reset that 
>>> configuration, causing it to stop working until you edit it again.
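>>>
>>> For example (an illustrative command - any operation that makes 
>>> glusterd regenerate the volfiles has the same effect):
>>>
>>>     gluster volume set storage performance.cache-size 256MB
>>>     # the volfiles are rewritten, and the hand-added
>>>     # transport.rdma.listen-port lines are gone again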
>>>
>>>>
>>>> Iain
>>>>
>>>> On 05/30/13 07:06, Joe Julian wrote:
>>>>> On 05/29/2013 10:33 PM, Iain Buchanan wrote:
>>>>>> Hi,
>>>>>>
>>>>>> I'm running GlusterFS 3.3.1-ubuntu1~precise9 and I'm having some 
>>>>>> problems with the "rdma" and "tcp,rdma" options I hope someone 
>>>>>> can help me with.
>>>>>>
>>>>>> 1. What does "tcp,rdma" actually do - does it let you mix both 
>>>>>> types of client?  (I did a few tests with iozone and found it 
>>>>>> gave identical performance to plain "tcp".)
>>>>>>
>>>>>> 2. I can't get "rdma" to work, even in the simplest case with a 
>>>>>> single node.
>>>>>>
>>>>>>     volume create storage transport rdma my_server:/data/area
>>>>>>     volume start storage
>>>>>>     mount -t glusterfs my_server:storage /mnt/storage
>>>>>>
>>>>>> The last line hangs.  Looking in /var/log/glusterfs I can see the 
>>>>>> log for the volume:
>>>>>>
>>>>>> [2013-05-30 06:24:19.605315] E [rdma.c:4604:tcp_connect_finish] 
>>>>>> 0-storage-client-0: tcp connect to  failed (Connection refused)
>>>>>> [2013-05-30 06:24:19.605713] W [rdma.c:4187:gf_rdma_disconnect] 
>>>>>> (-->/usr/sbin/glusterfs(main+0x34d) [0x7f374d38a3ed] 
>>>>>> (-->/usr/lib/libglusterfs.so.0(+0x3bd17) [0x7f374cf23d17] 
>>>>>> (-->/usr/lib/glusterfs/3.3.1/rpc-transport/rdma.so(+0x5231) 
>>>>>> [0x7f3743398231]))) 0-storage-client-0: disconnect called (peer:)
>>>>>> [2013-05-30 06:24:19.605763] W 
>>>>>> [rdma.c:4521:gf_rdma_handshake_pollerr] 
>>>>>> (-->/usr/sbin/glusterfs(main+0x34d) [0x7f374d38a3ed] 
>>>>>> (-->/usr/lib/libglusterfs.so.0(+0x3bd17) [0x7f374cf23d17] 
>>>>>> (-->/usr/lib/glusterfs/3.3.1/rpc-transport/rdma.so(+0x5150) 
>>>>>> [0x7f3743398150]))) 0-rpc-transport/rdma: storage-client-0: peer 
>>>>>> () disconnected, cleaning up
>>>>>>
>>>>>> This block repeats every few seconds - the line "tcp connect to 
>>>>>>  failed" looks like it has lost the server name somehow?
>>>>>>
>>>>>> Iain
>>>>>>
>>>>> If you've installed from the yum repo (http://goo.gl/s077x), that 
>>>>> shouldn't be happening - kkeithley applied the patch. If not, 
>>>>> rdma's broken in 3.3.[01]: 
>>>>> https://bugzilla.redhat.com/show_bug.cgi?id=849122
>>>>>
>>>>> To mount via rdma when using tcp,rdma:
>>>>>
>>>>>     mount -t glusterfs server1:myvol.rdma /mnt/foo
>>>>>
>>>>>
>>>
>>>
>>>
>>
>
