Gluster 3.4.0 RDMA stops working with more than a small handful of nodes

I was wondering if anyone on this list has run into this problem. When creating RDMA-only volumes across roughly half a dozen nodes or fewer, I am able to successfully create, start, and mount them (roughly the sequence sketched below).
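
For context, this is the basic create/start/mount sequence that works at small scale. It is only a sketch: the hostnames beyond cs1-p and the brick path /bricks/perftest are illustrative assumptions, not taken verbatim from my setup:

root@cs1-p:~# gluster volume create perftest replica 2 transport rdma cs{1..6}-p:/bricks/perftest   # 3 x 2 = 6 bricks
root@cs1-p:~# gluster volume start perftest
root@cs1-p:~# mount -t glusterfs cs1-p:/perftest /mnt/perftest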

However, if I try to scale this to 20, 50, or even 100 nodes, RDMA-only volumes fall over completely. Some of the basic symptoms I'm seeing are listed below (and condensed into a command sequence after the list):

* Volume create always completes successfully; however, when you go to start the volume it reports failure - only for the volume info command to state that the volume is started
* Attempting to mount this "started" volume results in a failure to mount, or hangs at the mount step
* Attempting to stop this "started" volume fails, with no error/success reported (the command simply times out and returns an empty status result)
* Attempting to delete this "started" volume also fails, without any error/success status reported
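
Condensed into a command sequence, the symptoms look roughly like this (the comments paraphrase the behaviour above; I am not reproducing exact error messages):

root@cs1-p:~# gluster volume create perftest replica 2 transport rdma ...  # succeeds
root@cs1-p:~# gluster volume start perftest   # reports failure...
root@cs1-p:~# gluster volume info perftest    # ...yet shows Status: Started
root@cs1-p:~# mount -t glusterfs cs1-p:/perftest /mnt/perftest  # fails or hangs
root@cs1-p:~# gluster volume stop perftest    # times out with an empty status result
root@cs1-p:~# gluster volume delete perftest  # fails, no error/success reported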

In order to clear the state, I have to stop/kill all gluster processes and then respawn them (see the sketch after the console output below). After this is completed, the volume info command still shows the volume as started; however, I can now successfully stop/delete the volume with a status of success:

root@cs1-p:~# gluster volume stop perftest
Stopping volume will make its data inaccessible. Do you want to continue? (y/n) y
volume stop: perftest: success

Volume Name: perftest
Type: Distributed-Replicate
Volume ID: ef206a76-7b26-4c12-9ccf-b3d250f36403
Status: Stopped
Number of Bricks: 50 x 2 = 100
Transport-type: rdma

root@cs1-p:~# gluster volume delete perftest
Deleting volume will erase all information about the volume. Do you want to continue? (y/n) y
volume delete: perftest: success
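
For completeness, the kill/respawn step referred to above is roughly the following, run on each node. This is a sketch: it assumes glusterd was launched directly rather than under an init script (in which case you would restart the service instead), and it uses the standard Gluster 3.x process names:

root@cs1-p:~# killall glusterd glusterfsd glusterfs   # management daemon, brick processes, clients
root@cs1-p:~# glusterd                                # respawn the management daemon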

If there is a known workaround for this, please let me know. Until then, the issue is tracked at: https://bugzilla.redhat.com/show_bug.cgi?id=985424


 