Re: Starter Cluster / GFS

Gordan Bobic <gordan@xxxxxxxxxx> · Fri, 19 Nov 2010 08:32:48 +0000

Nicolas Ross wrote:
1. Raid sets

So, I made up a 2-node cluster for the moment. I was able to bring up
the cluster and make a GFS file system, in fact 2. We've run some tests
with different RAID strategies. Our first idea for the GFS was to use 5
1 TB disks in RAID 5. With that I got a 4 TB fs. It has been suggested
previously that this might not be a good idea. Our controller doesn't
directly support RAID 10, which seems to be the consensus for a better
setup. We will be doing the RAID 0 part on Linux.

I made 2 RAID 1 sets of 1 TB (2 disks each) on our RAID enclosure, and
added them to a single VG. I created an LV on top of that, so I end up
with a 2 TB fs. We don't plan on using striping on the LV (-i2) because
of the overhead: if we add more space, we will need to add 2 sets of
RAID 1. So we plan on making a "starter" GFS with those 2 sets (2 TB
total). It's nearly double the 1.1 TB we have now, so we'll start with
that.

Now, we ran some write tests with dd, and judging by the disk activity,
all data was written to the first disk (pair) of the VG, and never to
the second one. I assume that once the first disk is full, it'll start
writing to the 2nd one. In the long term, I don't believe it'll be a
problem, but I'd prefer if the data was written alternately to both
disks without using stripes. Is that possible? I looked at the --alloc
option of vgcreate, but it doesn't seem to be that.

Is this the storage you are sharing between the nodes? If so, how 
exactly are you doing it?

Also, you do realize that you don't have to use LVM at all? It is 
entirely optional.
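
The dd observation above matches what a default (linear) LV would be
expected to do: logical extents are laid out on the first PV until it is
used up, and only then on the next one. A toy sketch of the two mappings
follows. The function names and extent counts are made up for
illustration, nothing is read from a real VG, and real striping works in
stripe-size chunks rather than whole extents.

```python
# Toy model of how an LV maps logical extents (LEs) onto physical
# volumes (PVs). Sizes and names are illustrative assumptions only.

def linear_map(le, pv_extents):
    """Linear (concatenated) LV: use up PV 0, then PV 1, and so on.

    Returns (pv_index, extent_on_that_pv).
    """
    for pv, size in enumerate(pv_extents):
        if le < size:
            return pv, le
        le -= size
    raise ValueError("logical extent beyond end of LV")


def striped_map(le, n_stripes):
    """Striped LV (roughly what -i2 gives for n_stripes=2).

    Simplified to one extent per stripe chunk.
    """
    return le % n_stripes, le // n_stripes


if __name__ == "__main__":
    pv_extents = [1000, 1000]   # two RAID 1 sets of equal size
    early = range(10)           # the first few extents a dd test touches
    print(sorted({linear_map(le, pv_extents)[0] for le in early}))  # [0]
    print(sorted({striped_map(le, 2)[0] for le in early}))          # [0, 1]
```

In this model, a write test that only touches the start of the LV never
reaches the second PV with the linear mapping, while the striped mapping
alternates between the two from the first extent.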

2. Network setup.

All our new servers have 3 NICs, one being dedicated to the
management module. I will be using the first one to make a private
network that will be serving my services. In my new setup, real routable
IPs will terminate at the router and will be NATed to the private ones
for eventual load-balancing. I will be using the second network on a
different VLAN and subnet for cluster communications. The management
modules will be on that same VLAN. So is this a good setup? Should I be
doing something differently?

In theory, your cluster/storage interface should be the same interface 
you access the fencing devices over. As long as you stick to that, it 
should be OK.

3. Deadlocks

I found a small C program for testing the locks/s that are possible on a
file accessed simultaneously on many nodes. (It's ping_pong, some of you
might have used it.) So, one of the parameters of that program is the
number of nodes using that file +1. On one test, I used 2 instead of 3
on one of the nodes. The programs on both nodes seemed stuck, not
killable, not even with -9. So I must assume that they were in some kind
of deadlock. dlm_tool deadlock_check didn't show anything, and I can't
make heads or tails of gfs2_tool lockdump or what to do with it. I was
forced to reboot (forcibly) one of the nodes. Most likely in my
production environment we won't get into that situation. But I want to
know what happened and what to do to prevent it or stop that kind of
lock.

What version of RHEL are you using? Early versions of RHEL5 had GFS2 
lock-up issues like you're describing. IIRC, GFS2 was only considered 
stable from around RHEL 5.5 (technology-preview-only in earlier versions 
of RHEL). Try with GFS1, it's a lot more mature.
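
For readers who haven't used it, the locking pattern that a ping_pong
style test exercises can be sketched roughly as below. This is a Python
approximation written for illustration, not the actual ping_pong source:
the function name and the bounded iteration count are assumptions. It
walks a chain of one-byte blocking fcntl locks around the first
num_locks bytes of a file, always taking the next byte before releasing
the current one.

```python
import fcntl
import os
import tempfile


def ping_pong(path, num_locks, iterations):
    """Cycle one-byte exclusive locks around bytes 0..num_locks-1 of path.

    num_locks plays the role of the "nodes + 1" parameter mentioned
    above. Returns the number of lock hand-overs completed.
    """
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        # Start by holding byte 0.
        fcntl.lockf(fd, fcntl.LOCK_EX, 1, 0, os.SEEK_SET)
        i = 0
        count = 0
        for _ in range(iterations):
            nxt = (i + 1) % num_locks
            # Blocking request: waits until the byte is free.
            fcntl.lockf(fd, fcntl.LOCK_EX, 1, nxt, os.SEEK_SET)
            # Release the byte we were holding.
            fcntl.lockf(fd, fcntl.LOCK_UN, 1, i, os.SEEK_SET)
            i = nxt
            count += 1
        return count
    finally:
        os.close(fd)


if __name__ == "__main__":
    with tempfile.NamedTemporaryFile() as f:
        print(ping_pong(f.name, 3, 1000))
```

Run from a single process on a local file, the locks never contend, so
this only shows the pattern. On a shared GFS/GFS2 mount each node would
run it against the same file, and every blocking request has to be
granted through the cluster lock manager.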

Gordan

--
Linux-cluster mailing list
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster