Gluster 3.1.1 issues over RDMA and HPC environment

Dear friends,

I have several stability and reliability problems in a small-to-medium
sized cluster. My configuration is the following:

66 compute nodes (IBM iDataPlex, X5550, 24 GB RAM)
1 access node (front end)
1 master node (queue manager and monitoring)
2 I/O servers with GlusterFS configured in distributed mode (4 TB in
total)

All computers have a dual-port Mellanox ConnectX QDR (40 Gbps) adapter
1 QLogic 12800-180 switch, 7 leaves of 24 ports each and two double
spines, QSFP plugs

CentOS 5.5 and xCAT as cluster manager
OFED 1.5.1
Gluster 3.1.1 over InfiniBand

When the cluster is fully loaded with applications that use MPI heavily,
in combination with other applications that do a lot of file system I/O,
GlusterFS stops working.
Also, when gendb runs InterProScan bioinformatics applications with 128
or more jobs, GlusterFS dies or disconnects clients randomly, so some
applications shut down because they can no longer see the file system.

This does not happen with Gluster over TCP (1 Gbps Ethernet), nor does
it happen with Lustre 1.8.5 over InfiniBand; under the same conditions
Lustre works fine.

My question is: is there any documentation with more specific
information on GlusterFS tuning?

I have only found basic information on configuring Gluster, nothing
more in-depth (i.e. for experts). I think there must be some option to
handle this situation in GlusterFS. Moreover, other people should be
seeing the same problems, since we replicated the configuration at
another site with the same results.
Perhaps the question is really about Gluster scalability: how many
clients are recommended per Gluster server when using RDMA on an
InfiniBand fabric at 40 Gbps?

I would appreciate any help. I want to use Gluster, but stability and
reliability are very important for us. Perhaps


claudio




