Thanks so much for the reply, hopefully this will lead to something.

On Fri, 2008-10-03 at 17:25 +0100, Gordan Bobic wrote:

> It sounds like you have a SAN (fibre attached storage) that you are trying to turn into a NAS. That's justifiable if you have multiple mirrored SANs, but makes a mockery of HA if you only have one storage device, since it leaves you with a single point of failure regardless of the number of front-end nodes.

Understood on the SAN being a single point of failure. We're addressing HA on the front end; we don't have the money to address the back end yet. Storage is something you set up once and don't have to mess with again, and it doesn't have application issues and the like, it's just storage. So barring a hardware failure not covered by redundant power supplies, spare disks, etc., it doesn't give us problems. Having a cluster on the front end lets us survive a software failure on one node, or reboot one node, while providing zero downtime to the clients.

> Do you have a separate gigabit interface/vlan just for cluster communication? RHCS doesn't use a lot of sustained bandwidth, but performance is sensitive to latencies for DLM comms. If you only have 2 nodes, a direct crossover connection would be ideal.

Not sure how to accomplish that. How do you get certain services of the cluster environment to talk over one interface, and other services (such as the shares) over another? The only other interface I have configured is for the fence devices (Dell DRAC cards). I've taken a guess at what you mean; see the sketch further down in this message.

> How big is your data store? Are files large or small? Are they in few directories with lots of files (e.g. Maildir)?

Very much mixed. We have SAS and SATA in the same SAN device, carved out based on application performance needs. Some volumes are large (7TB), some small (2GB). Content ranges from large files (video) down to millions of 1k user files.

> Load averages will go up - that's normal, since there is added latency (round trip time) from locking across nodes. Unless your CPUs are at 0% idle, the servers aren't running out of steam. So don't worry about it.

Understood. That was just the measure I used for comparison. There is a definite performance lag during these higher load averages. What I was trying (and failing) to communicate is that all we are doing here is serving files over NFS; we're not running applications on the cluster itself, so it's difficult for me to understand why file serving would be so slow, or ever drive the load on a box that high. And the old file server did not have these performance issues doing the same tasks with less hardware, bandwidth, etc.

> Also note that a clustered FS will _ALWAYS_ be slower than a non-clustered one, all things being equal. No exceptions. Also, if you are load sharing across the nodes and you have Maildir-like file structures, it will go slower than a purely fail-over setup, even on a clustered FS (like GFS), since in a fail-over setup there is no lock bouncing between head nodes. For extra performance, you can use a non-clustered FS as a failover resource, but be very careful with that, since dual mounting a non-clustered FS will destroy the volume virtually instantly.

Agreed. That's not the comparison, though. Our old file server was running a clustered file system from Tru64 (AdvFS). Our expectation was that newer technology, plus a major upgrade in hardware, would give us at least the performance we had before; it has not, it is far worse.
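
Coming back to the dedicated cluster interface question: is something like the below what you mean? This is just my guess from the docs, and all the hostnames and addresses are made up. As I understand it, cman binds cluster communication to whichever interface the cluster node names resolve to, so pointing the node names at the crossover link in /etc/hosts should keep the DLM traffic off the public interface.

    /etc/hosts on both nodes (crossover link, addresses made up):

        10.10.10.1   node1-hb
        10.10.10.2   node2-hb

    /etc/cluster/cluster.conf, using those names as the node names
    (fence device names refer to the DRAC entries defined elsewhere
    in the file):

        <clusternodes>
          <clusternode name="node1-hb" nodeid="1" votes="1">
            <fence>
              <method name="1">
                <device name="drac-node1"/>
              </method>
            </fence>
          </clusternode>
          <clusternode name="node2-hb" nodeid="2" votes="1">
            <fence>
              <method name="1">
                <device name="drac-node2"/>
              </method>
            </fence>
          </clusternode>
        </clusternodes>

The NFS exports themselves would still be served through an <ip/> resource on the public interface, so the clients shouldn't notice any of this. Is that roughly right, or have I got the wrong end of the stick?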
> Provided that your data isn't fundamentally unsuitable for being handled by a clustered load sharing setup, you could try increasing lock trimming and increasing the number of resource groups. Search through the archives for details on that.

Can you point me in the direction of the archives? I can't seem to find them. (I've put my best guess at the tunables you mean at the bottom of this message; please correct me if I'm off.)

> More suggestions when you provide more details on what your data is like.

My apologies for the lack of detail; I'm a bit lost as to what to provide. It's basic files, large and small: user volumes, webserver volumes, Postfix mail volumes, etc.

Thanks so much!

> Gordan
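
P.S. On the lock trimming and resource groups point: from the gfs_tool and gfs_mkfs man pages, I'm guessing you mean something along these lines. The mount point and values below are just examples, and I'd want to confirm them against the archives before touching production.

    # Trim unused glocks more aggressively (set per mounted GFS filesystem;
    # gfs_tool settune values don't persist across a remount)
    gfs_tool settune /mnt/gfs_volume glock_purge 50    # purge ~50% of unused glocks
    gfs_tool settune /mnt/gfs_volume demote_secs 120   # demote glocks sooner (default 300)

    # Resource group layout is fixed at mkfs time; a smaller -r (size in MB,
    # default 256) gives more resource groups, so I assume changing this
    # means re-making the filesystem?
    gfs_mkfs -p lock_dlm -t mycluster:myfs -j 2 -r 128 /dev/myvg/myfs

Does that look like the right direction, or is there more to it?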