On 07/16/2012 09:16 AM, Philippe Muller wrote:
> Hi RedHat & GlusterFS users,
>
> Last weekend, I worked on a GlusterFS cluster upgrade, from 3.0.3 to
> 3.3.0.
> We were using hand-made volume files defining 2 volumes, a distributed
> one and a replicated-distribute one, both using the "transport-type
> ib-verbs" option.
>
> One of our objectives was to use the "gluster" CLI tool (which didn't
> exist in 3.0.3, from what I remember).
>
> Here is what we did:
> 1 - Shut down all glusterfs instances
> 2 - Install Gluster 3.3.0
> 3 - Start glusterd on all hosts
> 4 - Create a trusted pool with all our hosts
> 5 - Create "compatible volumes" using the CLI tool, with the same
> bricks we were using with our hand-made volfiles and with the "rdma"
> transport (since ib-verbs was no longer an option...)
> 6 - Mount the volumes
>
> Of course, we tested that scenario on VMs. No issues with data. We
> tested everything except... RDMA!
>
> When we finally made the upgrade, everything went fine, except
> mounting the volumes. We got this kind of error message in the log files:
> "E [rdma.c:4458:tcp_connect_finish] 0-zodiac-client-3: tcp connect to  failed (Connection refused)"
> (notice the 2 white spaces between "connect to" and "failed")
> That reminded me of an earlier problem we had with the subnet manager
> running on the IB switch. But this time, the switch wasn't responsible;
> IPoIB was still running fine...
>
> I scratched my head more than once, thinking about what I could
> possibly have forgotten. Then I searched for all the information I could
> find about RDMA and 3.3.0.
>
> Here is what I found:
> - On page 123 of the "GlusterFS Administration Guide 3.3.0", a small
> note saying: "NOTE: with 3.3.0 release, transport type 'rdma' and
> 'tcp,rdma' are not fully supported."
> - On July 7, Ling Ho started a thread on this mailing list, with very
> similar symptoms:
> http://www.mail-archive.com/gluster-users at gluster.org/msg09326.html ;
> but he didn't get any answer.
>
> Given the urgency of the upgrade, we weren't sure rolling back to 3.0.3
> was a good option (since we don't know precisely which XFS attributes
> were modified by 3.3.0 on the backend FS). So we switched to TCP (over
> IPoIB).
>
> It's working. We are now running 3.3.0. But we are no longer taking
> advantage of RDMA.
>
> So here are a few questions:
> - Did I miss something that prevented me from using RDMA in 3.3.0?
> - Is there a way to use RDMA in 3.3.0?
> - Is there any official communication about the 3.3.0 RDMA issue?
> - Is there a 3.3.x release with RDMA support planned? If so, when?
> - Will the RDMA transport be dropped in future releases?
>
> Thanks!
> (and yeah, despite that issue, I still love GlusterFS :-)
>
> Philippe Muller

I just came back from a one-week vacation. Yes, I didn't get any reply
from the list, and was not able to get RDMA working when the server is
configured for tcp,rdma. When I was doing testing, I had set up the
server using rdma only and totally missed this.

I ended up using TCP over IPoIB. The performance is much better than
TCP over 10G/s. However, since I am in a mixed environment, I have to do
some static routing on the gluster server, basically routing the IPoIB
subnet to the 10G/s subnet that the bricks are all set up with.

Things have been working fine.

...
ling
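
For readers following the thread, steps 4 to 6 of the quoted procedure can be sketched as a dry run that only builds the command lines and executes nothing. The host names, brick paths, volume name, and mount point below are hypothetical, and the function name is made up for illustration. The command forms (`gluster peer probe`, `gluster volume create ... transport ...`, `gluster volume start`, `mount -t glusterfs`) are the usual 3.3 CLI forms as far as I know, so check them against the installed version before running anything.

```python
import shlex

# Transport values named in the thread and in the admin guide note.
VALID_TRANSPORTS = ("tcp", "rdma", "tcp,rdma")


def upgrade_commands(hosts, volume, bricks, mount_point,
                     transport="tcp", replica=None):
    """Build the CLI commands for steps 4-6 as argument lists.

    Nothing is executed; the caller decides whether to print or run them.
    All names passed in are placeholders chosen by the caller.
    """
    if transport not in VALID_TRANSPORTS:
        raise ValueError("unsupported transport: %r" % (transport,))

    commands = []

    # Step 4: build the trusted pool by probing every other host
    # from the first one.
    for host in hosts[1:]:
        commands.append(["gluster", "peer", "probe", host])

    # Step 5: create the volume on the existing bricks.
    create = ["gluster", "volume", "create", volume]
    if replica is not None:
        create += ["replica", str(replica)]
    create += ["transport", transport] + list(bricks)
    commands.append(create)
    commands.append(["gluster", "volume", "start", volume])

    # Step 6: mount the volume with the native client.
    commands.append(["mount", "-t", "glusterfs",
                     "%s:/%s" % (hosts[0], volume), mount_point])
    return commands


if __name__ == "__main__":
    # Hypothetical two-node replicated volume using the TCP fallback.
    for cmd in upgrade_commands(
            hosts=["node1", "node2"],
            volume="vol0",
            bricks=["node1:/export/brick1", "node2:/export/brick1"],
            mount_point="/mnt/vol0",
            transport="tcp",
            replica=2):
        print(" ".join(shlex.quote(arg) for arg in cmd))
```

The transport is a parameter, so the same sketch covers the intended "rdma" setup as well as the TCP-over-IPoIB fallback that both posters ended up with. It does not address the mount failure itself.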