Hello there gluster developers and users,

I'm trying to get a handle on what it takes to get glusterfs to work reliably. After several weeks of testing we have not yet been able to get it to run stably in our setup, and I'm beginning to wonder if there is a statistical approach to finding out what works and what doesn't, rather than trying to go about it one bug at a time. The goal is a 'compatibility' list of sorts: things you should do if you intend to run glusterfs in production, plus a percentage-wise idea of how many installations run smoothly and how many people are experiencing issues.

One of the nastier pitfalls we ran into is that the build does not warn when it builds a kernel module that will never load, because fuse is already linked into the kernel. This means that even though you think you are running an updated fuse, you may in fact be using the one that was linked in with the kernel. This is easy to check by running lsmod: if it does not show the fuse module, you are using the in-kernel one.

This, combined with the various reports of success here, makes me want to ask the group on this list the following:

- what is your hardware configuration?
- what is your glusterfs configuration?
- have you experienced problems building / installing / using the system?

And most interesting of all: if you have not experienced any difficulty, is there a single point that you think sets your setup apart from the ones that fail (or a common element between yours and the ones that don't)? This might help to compile a checklist of 'must have' conditions for running glusterfs stably in a production environment. It will also give me a feeling for whether glusterfs not working right 'out of the box' is the rule rather than the exception.
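The lsmod check mentioned above is easy to script; here is a minimal sketch (the helper name fuse_status is mine, not part of any tool):

```shell
# Minimal sketch of the check described above: if 'fuse'
# appears at the start of an lsmod output line, the loadable
# module is in use; if not, you are probably running the fuse
# code built into the kernel. fuse_status is a hypothetical
# helper that takes lsmod output as its argument.
fuse_status() {
    if echo "$1" | grep -q '^fuse'; then
        echo "fuse module loaded"
    else
        echo "fuse module NOT loaded - likely built into the kernel"
    fi
}

# On a live system you would call: fuse_status "$(lsmod)"
fuse_status "fuse  86016  3"   # sample lsmod line
```

If the second branch fires even though you installed a patched fuse, the module you built is being shadowed by the in-kernel one.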
For starters, here is my setup:

Hardware: 5 node cluster, dual Opterons, 8G RAM per box, Supermicro chassis, 200G SATA drives. 100 Mb link to the net, GigE backchannel between the nodes. The machines run Debian 'etch' 64-bit Linux, kernel version 2.6.17.11. Fuse has been upgraded to the glfs4 patch.

Glusterfs configuration: readahead / writebehind / unify over all nodes in the cluster (currently only 4, because one machine developed a hardware problem).

Initially we ran version 1.3.1, but with a lot of problems; these were later traced to the fuse module not loading (see above). After that we upgraded to the current tla release (504). We've had one issue of a glusterfs client process looping, and we're trying to track that bug as well as another one that caused some problems while running tests.

The tests we are running are 'dbench' runs with lengths of up to 3 days; sometimes it fails quickly and sometimes it takes longer. dbench simulates a number of users on a LAN accessing a file server, and it has so far served well as a way to bring issues to the surface.

Current status: not stable enough for any production work, but incrementally improving stability, with some setbacks.

best regards,

Jacques Mattheij
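PS: for reference, here is a rough sketch of the kind of client volume file our readahead / writebehind / unify setup describes. Hostnames, volume names and the exact option set are placeholders from memory, not our actual config; in this release unify also wants a namespace brick, sketched here as brick-ns. Check the translator docs before copying anything from this.

```
volume brick1
  type protocol/client
  option transport-type tcp/client
  option remote-host node1
  option remote-subvolume brick
end-volume

# brick2 .. brick4 are defined the same way, and brick-ns
# points at a small namespace volume on one of the servers.

volume unify0
  type cluster/unify
  subvolumes brick1 brick2 brick3 brick4
  option namespace brick-ns
  option scheduler rr
end-volume

volume wb
  type performance/write-behind
  subvolumes unify0
end-volume

volume ra
  type performance/read-ahead
  subvolumes wb
end-volume
```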