Hi,
I want to report back on the performance issues I've been having so far with
GlusterFS mainline 2.5, patch 690, and fuse-2.7.2glfs8.
I'm setting up a mail system that runs entirely under Xen 3.2.0; every
"actual" piece of the mail system is a Xen virtual machine. The virtual
machines accessing GlusterFS are 6 Dovecot servers and 4 Postfix servers.
There are also 6 storage nodes, each exporting its own disk to the Gluster
filesystem. Two of those nodes export 2 disks each: one for the GlusterFS
storage and the other for the namespace.
These are the conf files:
**** nodes with namespace ****

# POSIX storage backend on the shared disk
volume esp
  type storage/posix
  option directory /mnt/compartit
end-volume

# POSIX locking on top of the storage
volume espa
  type features/posix-locks
  subvolumes esp
end-volume

# I/O threads in front of the locked storage
volume espai
  type performance/io-threads
  option thread-count 15
  option cache-size 512MB
  subvolumes espa
end-volume

# namespace backend (only these two nodes have it)
volume nm
  type storage/posix
  option directory /mnt/namespace
end-volume

# export both volumes over TCP, open to any IP
volume ultim
  type protocol/server
  subvolumes espai nm
  option transport-type tcp/server
  option auth.ip.espai.allow *
  option auth.ip.nm.allow *
end-volume

******************************
**** nodes without namespace ****

volume esp
  type storage/posix
  option directory /mnt/compartit
end-volume

volume espa
  type features/posix-locks
  subvolumes esp
end-volume

volume espai
  type performance/io-threads
  option thread-count 15
  option cache-size 512MB
  subvolumes espa
end-volume

# same as above, but only the storage volume is exported
volume ultim
  type protocol/server
  subvolumes espai
  option transport-type tcp/server
  option auth.ip.espai.allow *
end-volume

******************************
**** clients ****

# one client connection per storage node
volume espai1
  type protocol/client
  option transport-type tcp/client
  option remote-host 192.168.1.204
  option remote-subvolume espai
end-volume

volume espai2
  type protocol/client
  option transport-type tcp/client
  option remote-host 192.168.1.205
  option remote-subvolume espai
end-volume

volume espai3
  type protocol/client
  option transport-type tcp/client
  option remote-host 192.168.1.206
  option remote-subvolume espai
end-volume

volume espai4
  type protocol/client
  option transport-type tcp/client
  option remote-host 192.168.1.207
  option remote-subvolume espai
end-volume

volume espai5
  type protocol/client
  option transport-type tcp/client
  option remote-host 192.168.1.213
  option remote-subvolume espai
end-volume

volume espai6
  type protocol/client
  option transport-type tcp/client
  option remote-host 192.168.1.214
  option remote-subvolume espai
end-volume

# connections to the two namespace nodes
volume namespace1
  type protocol/client
  option transport-type tcp/client
  option remote-host 192.168.1.204
  option remote-subvolume nm
end-volume

volume namespace2
  type protocol/client
  option transport-type tcp/client
  option remote-host 192.168.1.205
  option remote-subvolume nm
end-volume

# mirror the storage nodes in pairs (AFR)
volume grup1
  type cluster/afr
  subvolumes espai1 espai2
end-volume

volume grup2
  type cluster/afr
  subvolumes espai3 espai4
end-volume

volume grup3
  type cluster/afr
  subvolumes espai5 espai6
end-volume

# mirrored namespace
volume nm
  type cluster/afr
  subvolumes namespace1 namespace2
end-volume

# unify the three mirrored groups, round-robin scheduling
volume ultim
  type cluster/unify
  subvolumes grup1 grup2 grup3
  option scheduler rr
  option namespace nm
end-volume

******************************
With earlier patches the whole system used to hang, with many different
error messages. Right now it has been up for days without a single hang,
but I'm facing serious performance issues.
Simply running an "ls" can take around 3 seconds to show anything when the
system is under load. It doesn't happen at all when there's no activity,
so I don't think it has anything to do with Xen. Note that "under load"
here can mean as little as 3 mails arriving per second. I'm monitoring
everything, and no virtual machine is using more than roughly 20% CPU.
At first I had the log level on both nodes and clients set to DEBUG, but
it is now just WARNING, and I have restarted everything many times.
It was suggested that I use "type performance/io-threads" on the node side,
and it actually helped: before that, "ls" took not 3 seconds but 5 or more.
I've tried several different values for "thread-count" and "cache-size".
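For reference, one of the node-side variants I tried looked roughly like
this (the thread-count and cache-size values below are just illustrative
examples of the kind of settings I experimented with, not a recommendation):

volume espai
  type performance/io-threads
  # illustrative values only; I tried several settings here
  option thread-count 8
  option cache-size 64MB
  subvolumes espa
end-volume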
The system is supposed to handle a large amount of traffic, far more than
3 mails per second.
What do you think about the whole setup? Should I keep using a namespace?
Should I use dedicated nodes for the namespace? Should I use different
values for io-threads?
One last thing... I'm using ReiserFS on the "storage devices" that the
nodes export. Should I be using XFS or something else?
The logs don't show any kind of error now, so I don't have a clue about
what is failing.
I would appreciate any ideas you could give.
Thank you.