Hi list,

I would like to share with you the issues I have been suffering with glusterfs in a web-farm production environment. I hope it can help somebody, and perhaps somebody can help me with this.

I created a glusterfs environment like the following:

6 servers, amd64, 4 cores, 4/8 GB RAM, 1 TB disk each
kernel 2.6.26-2-xen-amd64

Each physical server runs one (or two) Xen virtual machines.
The physical servers only serve glusterfs to the clients.
The Xen servers are: 4 apache, 2 mysql, 2 load balancers (LVS) and a plesk server.

----------- phy server -----------
volume posix1
  type storage/posix
  option directory /mnt/export
end-volume

volume locks1
  type features/locks
  subvolumes posix1
end-volume

volume brick1
  type performance/io-threads
  option thread-count 8
  subvolumes locks1
end-volume

volume server-tcp
  type protocol/server
  option transport-type tcp
  option auth.addr.brick1.allow *
  option transport.socket.listen-port 6996
  option transport.socket.nodelay on
  subvolumes brick1
end-volume
--------------

5 Xen servers have this configuration:

------------- xen server ------------
volume nodo1
  type protocol/client
  option transport-type tcp
  option remote-host 10.0.0.11
  option transport.socket.nodelay on
  option remote-port 6996
  option remote-subvolume brick1
end-volume

volume nodo2
  type protocol/client
  option transport-type tcp
  option remote-host 10.0.0.12
  option transport.socket.nodelay on
  option remote-port 6996
  option remote-subvolume brick1
end-volume

volume nodo3
  type protocol/client
  option transport-type tcp
  option remote-host 10.0.0.13
  option transport.socket.nodelay on
  option remote-port 6996
  option remote-subvolume brick1
end-volume

volume nodo4
  type protocol/client
  option transport-type tcp
  option remote-host 10.0.0.14
  option transport.socket.nodelay on
  option remote-port 6996
  option remote-subvolume brick1
end-volume

volume nodo5
  type protocol/client
  option transport-type tcp
  option remote-host 10.0.0.15
  option transport.socket.nodelay on
  option remote-port 6996
  option remote-subvolume brick1
end-volume

volume nodo6
  type protocol/client
  option transport-type tcp
  option remote-host 10.0.0.16
  option transport.socket.nodelay on
  option remote-port 6996
  option remote-subvolume brick1
end-volume

volume mirror-0
  type cluster/replicate
  subvolumes nodo1 nodo2 nodo3
end-volume

volume mirror-1
  type cluster/replicate
  subvolumes nodo4 nodo5 nodo6
end-volume

volume distribute
  type cluster/distribute
  subvolumes mirror-0 mirror-1
end-volume

volume readahead
  type performance/read-ahead
  option page-count 4
  subvolumes distribute
end-volume

volume iocache
  type performance/io-cache
  option cache-size `echo $(( $(grep 'MemTotal' /proc/meminfo | sed 's/[^0-9]//g') / 5120 ))`MB
  option cache-timeout 1
  subvolumes readahead
end-volume

volume quickread
  type performance/quick-read
  option cache-timeout 1
  option max-file-size 512kb
  subvolumes iocache
end-volume

volume writebehind
  type performance/write-behind
  option cache-size 4MB
  subvolumes quickread
end-volume

volume statprefetch
  type performance/stat-prefetch
  subvolumes writebehind
end-volume
-----------------------

There is a dedicated switch for glusterfs traffic, and the second NIC of each server is connected to it.

The problems:

On both the physical servers and the Xen servers, the glusterfs process consumes a lot of RAM, so overall performance on the clients is poor. It does not happen all the time, only at specific moments. When it occurs, I stop glusterfs on the nodes and start it again, and everything goes back to normal.

If the web domain served by the apaches is not too complicated (a few php files, jpg/gif files), the performance is good. But if there is a joomla, magento or some other product with many files spread across many directories, the performance is not so good (page loads go from 2-3 seconds in optimal conditions to 7-8 seconds, more or less).

The thing is that, for the moment, I have had to deactivate glusterfs because the dealers are complaining about the performance, and I'm looking at other solutions (NFS, DRBD). But I don't want to discard glusterfs, because I believe it is a good solution for the future.

I hope glusterfs version 3.1 will solve many of the problems I have today but, looking at my configuration files, can anybody tell me whether I could do better than this?

Many thanks in advance

Kyle
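
A note on the `cache-size` line in the iocache volume above: MemTotal in /proc/meminfo is reported in kB, so dividing it by 5120 (1024 * 5) gives roughly one fifth of the machine's RAM in MB. Below is a minimal sketch for checking by hand what value that expression produces on a given client. It assumes a Linux /proc/meminfo; the function name `cache_size_mb` is only illustrative and is not part of glusterfs.

```shell
#!/bin/sh
# cache_size_mb: hypothetical helper, not a glusterfs tool.
# Takes MemTotal in kB and prints one fifth of it in whole MB
# (kB / 1024 / 5 = kB / 5120, integer division).
cache_size_mb() {
    echo $(( $1 / 5120 ))
}

# Read MemTotal (in kB) the same way the volfile expression does,
# then print the option line it would expand to.
if [ -r /proc/meminfo ]; then
    mem_kb=$(grep 'MemTotal' /proc/meminfo | sed 's/[^0-9]//g')
    echo "option cache-size $(cache_size_mb "$mem_kb")MB"
fi
```

For example, a MemTotal of 8388608 kB yields 1638, i.e. about 1.6 GB of io-cache on that client, which may be worth keeping in mind when looking at the RAM usage of the glusterfs process.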