Re: Starter Cluster / GFS


 



On 11/18/2010 09:23 PM, Nicolas Ross wrote:
Hi again!

I am beginning to play with my new servers. For starters I got 2 nodes (1U
Intel server platform, with an LSI Logic FC949ES FC card). I am like a child
playing with his new toy at Christmas...

So, now I have a few points and questions. Sorry if it's long.

1. RAID sets

So, I have set up a 2-node cluster for the moment. I was able to bring up
the cluster and make a GFS file system, in fact two. We've run some tests
with different RAID strategies. Our first idea for the GFS was to use five
1 TB disks in RAID 5, which gave me a 4 TB fs. It has been suggested
previously that might not be a good idea. Our controller doesn't directly
support RAID 10, which seems to be the consensus for a better setup, so we
will be doing the 0 part in Linux.

I made two 1 TB RAID 1 sets (2 disks each) on our RAID enclosure and added
them to a single VG. I created an LV on top of that, which yields a 2 TB
fs. We don't plan on using striping on the LV (-i2) because of the
overhead: if we add more space later, we would need to add two RAID 1 sets
at a time. So we plan on making a "starter" GFS with those two sets (2 TB
total). It's nearly double the 1.1 TB we have now, so we'll start with that.
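
For reference, the setup is roughly along these lines (device, VG, LV and
cluster names here are placeholders, not the exact ones we use):

    # Two hardware RAID 1 LUNs as physical volumes in one volume group
    pvcreate /dev/mapper/lun0 /dev/mapper/lun1
    vgcreate vg_gfs /dev/mapper/lun0 /dev/mapper/lun1

    # Linear (non-striped) logical volume spanning both PVs, ~2 TB
    lvcreate -l 100%FREE -n lv_gfs vg_gfs

    # GFS2 with one journal per node (2-node cluster)
    mkfs.gfs2 -p lock_dlm -t mycluster:gfs1 -j 2 /dev/vg_gfs/lv_gfs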

Now, we made some write tests with dd, and judging by the disk activity,
all data was written to the first disk (pair) of the VG and never to the
second one. I assume that once the first pair is full, it will start
writing to the second one. In the long term I don't believe it will be a
problem, but I'd prefer the data to be written alternately to both pairs
without using stripes. Is that possible? I looked at the --alloc option to
vgcreate, but it doesn't seem to be that.

RAID 0 means that data should be split evenly across both array members, so I would suspect a problem there. If the RAID controller is a simple one (I am not familiar with the model), I might suggest building the entire array in Linux. It would be interesting to see the performance difference, if any. Though I prefer that mainly because I am more familiar with mdadm, so take that recommendation with a grain of salt.
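
If you do try it in Linux, a minimal mdadm sketch would look something like
this (device names and the VG name are made up; adjust for your disks):

    # 4-disk RAID 10 built entirely in software, replacing the
    # controller's RAID 1 sets
    mdadm --create /dev/md0 --level=10 --raid-devices=4 \
          /dev/sdb /dev/sdc /dev/sdd /dev/sde

    # Use the md device as the single PV under the GFS volume group
    pvcreate /dev/md0
    vgcreate vg_gfs /dev/md0

With a single RAID 10 device underneath, writes spread across both mirror
pairs on their own, without needing LVM striping or a particular allocation
policy.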

2. Network setup.

All our new servers have 3 NICs, one being dedicated to the management
module. I will be using the first one for a private network that will
serve my services. In the new setup, real routable IPs will terminate at
the router and be NATed to the private ones for eventual load balancing.
I will be using the second NIC, on a different VLAN and subnet, for
cluster communications. The management modules will be on that same VLAN.
So, is this a good setup? Should I be doing something differently?

Sounds okay to me. I like having a dedicated subnet for data syncing, but what really matters is that cluster communication and fencing are separate from Internet traffic.
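
One detail worth double-checking: cman/corosync binds to whichever network
the node names in cluster.conf resolve to, so make sure those names point
at the cluster VLAN. A sketch, with made-up names and addresses:

    # /etc/hosts (example addresses only)
    10.10.0.1    node1-cl    # cluster/fence VLAN
    10.10.0.2    node2-cl
    192.168.0.1  node1       # private service network
    192.168.0.2  node2

    # cluster.conf then uses the -cl names, e.g.:
    #   <clusternode name="node1-cl" nodeid="1"> ...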

3. Deadlocks

I found a small C program for testing how many locks/s are possible on a
file accessed simultaneously from many nodes (it's ping_pong; some of you
might have used it). One of the parameters of that program is the number
of nodes using that file, plus 1. In one test, I used 2 instead of 3 on
one of the nodes. The programs on both nodes then seemed stuck, not
killable, not even with -9, so I must assume they were in some kind of
deadlock. dlm_tool deadlock_check didn't show anything, and I can't make
heads or tails of gfs2_tool lockdump or what to do with it. I was forced
to (forcibly) reboot one of the nodes. Most likely we won't run into that
situation in my production environment, but I want to know what happened
and what to do to prevent or break that kind of lock.

Do you have your fence devices configured and working properly? A failure to fence can hang a cluster. Also, are you using managed switches, and do you have either IGMP snooping or spanning tree enabled?
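
For what it's worth, a few quick checks (the node name and file path below
are examples, and the commands are from the cman tool set, so exact names
may vary by release):

    # ping_pong's second argument should be nodes + 1 (3 for a 2-node test)
    ping_pong /gfs/test.dat 3

    # Cluster membership and the fence/DLM/GFS groups
    cman_tool nodes
    group_tool ls

    # Manually fence a node to prove fencing works (this power-cycles it!)
    fence_node node2

If fence_node fails, fix that first; a failed or hung fence call blocks DLM
recovery and can leave processes stuck in uninterruptible lock waits.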

--
Digimer
E-Mail: digimer@xxxxxxxxxxx
AN!Whitepapers: http://alteeve.com
Node Assassin:  http://nodeassassin.org

--
Linux-cluster mailing list
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster

