I tried downgrading to 3.2 and the rdma connection works, with
substantially better performance in some cases than 3.3 using IP over IB.
Is there a version of 3.3 that anyone is using in production with rdma
that I can try?

Iain

On 05/30/13 09:24, Iain Buchanan wrote:
> I'm now able to connect to volumes using rdma when they are created
> using "tcp,rdma", but it appears that the data is still being
> transferred over ethernet. If I run "ifstat" while running iozone I can
> see a lot of data being moved through the ethernet adapter (nothing
> else is on the box) and the performance is basically identical to plain
> ethernet. When I create the volume with just "rdma" I can't even mount
> the volume again (see error below).
>
> I've modified all the volume files, adding the
> transport.rdma.listen-port lines after completing set-up (anywhere
> there is an "option transport-type rdma" line I've added it - quite a
> few places).
>
> volume create storage transport tcp,rdma my_server1:/data/area
> volume add-brick storage replica 2 my_server2:/data/area
> volume start storage
> mount -t glusterfs my_server:storage.rdma /mnt/storage
>
> I'm now back to the error I had before the workaround on tcp,rdma - the
> lines are in the config files, and I've restarted the service. I can
> see my alterations in the config dumped into the log, but then it
> returns to the original errors:
>
> [2013-05-30 09:13:41.871408] E [rdma.c:4604:tcp_connect_finish]
> 0-storage-client-0: tcp connect to failed (Connection refused)
> [2013-05-30 09:13:41.871467] W [rdma.c:4187:gf_rdma_disconnect]
> (-->/usr/sbin/glusterfs(main+0x34d) [0x7f563a24c3ed]
> (-->/usr/lib/libglusterfs.so.0(+0x3bd17) [0x7f5639de5d17]
> (-->/usr/lib/glusterfs/3.3.1/rpc-transport/rdma.so(+0x5231)
> [0x7f5634012231]))) 0-storage-client-0: disconnect called (peer:)
>
> I saw a message on the mailing list that seems to suggest plain "rdma"
> doesn't work in the 3.3 series - is this correct?
> http://www.gluster.org/pipermail/gluster-users/2013-January/035115.html
>
> (I can run ib_read_bw etc. between the two servers without problems.)
>
> Iain
>
>
> On 05/30/13 08:11, Iain Buchanan wrote:
>> Just to confirm - putting the line "option transport.rdma.listen-port
>> 24008" into the first "volume" block in the two files with "rdma" in
>> their names under /var/lib/glusterd/vols/<volumename> seems to have
>> fixed the issue. I'm now able to mount and I can run iozone on a
>> single node. I'll give it a go with two nodes and see if rdma makes
>> any difference.
>>
>> Thanks for your help, Joe!
>>
>> Iain
>>
>>
>> On 05/30/13 07:59, Joe Julian wrote:
>>> On 05/29/2013 11:42 PM, Iain Buchanan wrote:
>>>> Thanks Joe,
>>>>
>>>> I tried mounting with the ".rdma" suffix after creating using
>>>> "transport tcp,rdma" and I get the same "tcp connect to failed"
>>>> messages in the log - I'm using the version from the semiosis PPA
>>>> at https://launchpad.net/~semiosis/+archive/ubuntu-glusterfs-3.3
>>>>
>>>> Looking at the dates on there I don't think it includes this fix.
>>>> I've sent the maintainer a message asking if they could update it.
>>>> In the meantime would Niels de Vos' workaround work? (Setting
>>>> transport.rdma.listen-port to 24008 in the glusterd.vol?)
>>>
>>> Yes, I had someone else try that and it worked. Remember, any
>>> changes made to the volume through the CLI will reset that
>>> configuration, causing it to stop working until you edit it again.
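
A minimal sketch of that volfile edit, for anyone else following the
thread: it assumes the 3.3.1 layout under /var/lib/glusterd/vols/, the
volume name "storage" from the examples above, a ".vol" extension on the
generated rdma files, and the Debian/Ubuntu service name
"glusterfs-server"; the 24008 port comes from this thread rather than any
official documentation, so keep backups and adjust as needed.

    # Append transport.rdma.listen-port after every rdma transport line
    # in the generated rdma volfiles for the "storage" volume.
    VOL=storage
    for f in /var/lib/glusterd/vols/$VOL/*rdma*.vol; do
        cp "$f" "$f.bak"                                       # keep a backup
        grep -q 'transport.rdma.listen-port' "$f" && continue  # already patched
        # indentation is only cosmetic in volfiles, so none is added here
        sed -i '/option transport-type rdma/a option transport.rdma.listen-port 24008' "$f"
    done
    service glusterfs-server restart   # pick up the edited volfiles

As noted above, any change made through the gluster CLI regenerates these
files and drops the edit, so it has to be re-applied afterwards (the
glusterd.vol variant mentioned earlier is the other option discussed in
the thread).
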
>>>
>>>> Iain
>>>>
>>>> On 05/30/13 07:06, Joe Julian wrote:
>>>>> On 05/29/2013 10:33 PM, Iain Buchanan wrote:
>>>>>> Hi,
>>>>>>
>>>>>> I'm running GlusterFS 3.3.1-ubuntu1~precise9 and I'm having some
>>>>>> problems with the "rdma" and "tcp,rdma" options that I hope
>>>>>> someone can help me with.
>>>>>>
>>>>>> 1. What does "tcp,rdma" actually do - does it let you mix both
>>>>>> types of client? (I did a few tests with iozone and found it gave
>>>>>> identical performance to plain "tcp".)
>>>>>>
>>>>>> 2. I can't get "rdma" to work, even in the simplest case with a
>>>>>> single node:
>>>>>>
>>>>>> volume create storage transport rdma my_server:/data/area
>>>>>> volume start storage
>>>>>> mount -t glusterfs my_server:storage /mnt/storage
>>>>>>
>>>>>> The last line hangs. Looking in /var/log/glusterfs I can see the
>>>>>> log for the volume:
>>>>>>
>>>>>> [2013-05-30 06:24:19.605315] E [rdma.c:4604:tcp_connect_finish]
>>>>>> 0-storage-client-0: *tcp connect to failed (Connection refused)*
>>>>>> [2013-05-30 06:24:19.605713] W [rdma.c:4187:gf_rdma_disconnect]
>>>>>> (-->/usr/sbin/glusterfs(main+0x34d) [0x7f374d38a3ed]
>>>>>> (-->/usr/lib/libglusterfs.so.0(+0x3bd17) [0x7f374cf23d17]
>>>>>> (-->/usr/lib/glusterfs/3.3.1/rpc-transport/rdma.so(+0x5231)
>>>>>> [0x7f3743398231]))) 0-storage-client-0: disconnect called (peer:)
>>>>>> [2013-05-30 06:24:19.605763] W
>>>>>> [rdma.c:4521:gf_rdma_handshake_pollerr]
>>>>>> (-->/usr/sbin/glusterfs(main+0x34d) [0x7f374d38a3ed]
>>>>>> (-->/usr/lib/libglusterfs.so.0(+0x3bd17) [0x7f374cf23d17]
>>>>>> (-->/usr/lib/glusterfs/3.3.1/rpc-transport/rdma.so(+0x5150)
>>>>>> [0x7f3743398150]))) 0-rpc-transport/rdma: storage-client-0: peer
>>>>>> () disconnected, cleaning up
>>>>>>
>>>>>> This block repeats every few seconds - the line "tcp connect to
>>>>>> failed" looks like it has lost the server name somehow?
>>>>>>
>>>>>> Iain
>>>>>>
>>>>> If you've installed from the yum repo (http://goo.gl/s077x) that
>>>>> shouldn't be happening - kkeithley applied the patch. If not, rdma
>>>>> is broken in 3.3.[01]:
>>>>> https://bugzilla.redhat.com/show_bug.cgi?id=849122
>>>>>
>>>>> To mount via rdma when using tcp,rdma:
>>>>> mount -t glusterfs server1:myvol.rdma /mnt/foo
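
One more note on the "is the data really going over RDMA" observation
further up the thread: ifstat only shows the ethernet side, so a rough
cross-check is to watch the InfiniBand port counters while the benchmark
runs. This is only a sketch - the HCA name (mlx4_0 here) and port number
depend on the hardware, and it assumes the usual counters exposed under
/sys/class/infiniband:

    # Snapshot the IB receive counter, run the workload, snapshot again.
    DEV=mlx4_0; PORT=1
    CTR=/sys/class/infiniband/$DEV/ports/$PORT/counters/port_rcv_data
    before=$(cat "$CTR")
    iozone -a -g 1G -f /mnt/storage/iozone.tmp   # any workload on the mount
    after=$(cat "$CTR")
    # port_rcv_data counts 32-bit words, so multiply by 4 for bytes
    echo "IB bytes received during the run: $(( (after - before) * 4 ))"

If that number stays near zero while ifstat shows gigabytes on the
ethernet NIC, the traffic is still going over IP on ethernet. (The
counter also includes IPoIB traffic, so it distinguishes the two fabrics
rather than RDMA from IPoIB on the same port.)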