Some commits to improve RDMA stability have gone in after 3.1.1. Can you check whether 3.1.2 still shows these issues?

Avati

On Sun, Feb 6, 2011 at 10:35 AM, Claudio Baeza Retamal <
claudio at dim.uchile.cl> wrote:

> Dear friends,
>
> I am seeing several stability and reliability problems in a
> small-to-medium sized cluster. My configuration is the following:
>
> 66 compute nodes (IBM iDataPlex, X5550, 24 GB RAM)
> 1 access node (front end)
> 1 master node (queue manager and monitoring)
> 2 I/O servers with GlusterFS configured in distributed mode (4 TB in
> total)
>
> All machines have a dual-port Mellanox ConnectX QDR (40 Gbps) HCA
> 1 QLogic 12800-180 switch, 7 leafs of 24 ports each and two double spines
> QSFP cabling
>
> CentOS 5.5 with xCAT as cluster manager
> OFED 1.5.1
> Gluster 3.1.1 over InfiniBand
>
> When the cluster is fully loaded with applications that use MPI heavily
> in combination with applications doing a lot of file-system I/O, GlusterFS
> stops working. Likewise, when GenDB runs the InterProScan bioinformatics
> application with 128 or more jobs, GlusterFS dies or disconnects clients
> at random, so some applications shut down because they can no longer see
> the file system.
>
> This does not happen with Gluster over TCP (1 Gbps Ethernet), and it does
> not happen with Lustre 1.8.5 over InfiniBand either; under the same
> conditions Lustre works fine.
>
> My question is: is there any documentation with more specific information
> on GlusterFS tuning?
>
> I have only found basic configuration information, nothing more in depth
> (i.e. for experts). There must be some option to handle this situation in
> GlusterFS; moreover, other people should be hitting the same problems,
> since we replicated the configuration at another site with the same
> results.
>
> Perhaps the question is really about Gluster scalability: how many
> clients are recommended per Gluster server when using RDMA over a 40 Gbps
> InfiniBand fabric?
>
> I would appreciate any help. I want to use Gluster, but stability and
> reliability are very important for us.
>
>
> claudio
>
>
>
> _______________________________________________
> Gluster-users mailing list
> Gluster-users at gluster.org
> http://gluster.org/cgi-bin/mailman/listinfo/gluster-users
>
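A quick way to act on the suggestion above is to confirm the installed version and the volume's transport type before and after upgrading, and to look at the client mount log where random disconnects are usually recorded. This is a hedged sketch: the log file name and mount point are assumptions (they depend on your install prefix and where the volume is mounted), while `glusterfs --version` and `gluster volume info` are standard CLI commands in the 3.1.x series.

```shell
# Report the installed GlusterFS version (run on both servers and clients;
# mismatched versions between them are a common source of trouble)
glusterfs --version

# Show the volume definition; the "Transport-type:" field indicates
# whether the bricks were created with tcp, rdma, or tcp,rdma
gluster volume info

# Client-side mount log: disconnect and reconnect events from the RDMA
# transport appear here. The file name below is an assumption -- it is
# derived from the mount point, e.g. /mnt/gluster -> mnt-gluster.log
tail -n 50 /var/log/glusterfs/mnt-gluster.log
```

Checking the transport type matters because a volume created as `tcp,rdma` can silently fall back to TCP on some clients, which would mask an RDMA-specific bug rather than fix it.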