Come on folks, you're making me feel like I should give up or something :)

From Gordan:

>I think a part of the problem is perception.

Perception can only be what marketing says it will do. I can't say I have once seen anything that says it won't scale performance-wise by virtue of what it is, a cluster. I looked at SSI and other types of clusters, and this seemed to be the key for my LAMP-based services.

>leads to _LOWER_ performance on I/O bound processes. If it's CPU bound,

Sure, there is a performance cost from each node, but I would guess it's an acceptable cost so long as I can work out the I/O side of things. I'm guessing a lot of folks have come up with all sorts of good ways of handling this; otherwise, no one would be using these tools.

>then sure, it'll help. But on I/O it'll likely do you harm. It's more
>about redundancy and graceful degradation than performance. There's no way
>of getting away from the fact that a cluster has to do more work than a
>single node, just because it has to keep itself in sync.

When I started learning about the RH cluster suite and GFS, it was because the hype was that I could build a highly scalable, highly available environment where I could share data in a way I had not been able to before. That sharing data part has been true and I love what I can achieve with that alone. I am now at the performance stage and need to get the most I can out of what I have. Time to ask questions so that I can have some basic starting points.

Even though each node has to do more work and the cluster costs me in performance overall, that seems to be pretty much like any other application out there. Applications cost part of the machine's resources; that's just the way things are.

>about redundancy and graceful degradation than performance. There's no way
>of getting away from the fact that a cluster has to do more work than a
>single node, just because it has to keep itself in sync.

Got it.
However, I've already spent months learning about the cluster suite and GFS, and much harder has been all of the networking involved: the fibre channel switches, the fibre channel storage and its endless needs. The list goes on and on, and I don't want to bore you. The bottom line is, I have a working cluster, sharing GFS space. I know it's costing me some resources from each node, I understand this. However, there's plenty left to work with :). I could use some input on where to start to get the most performance I can out of what I have.

>The only way clustering will give you scaleable performance benefit is
>with partitioned (as opposed to shared) data. Shared data clustering is
>about convenience and redundancy, not about performance.

I agree, but this is a very general statement. In my case, I have a LAMP application which benefits more from having shared GFS space. I might move to purely distributed at some point, but for now I'd prefer to find out what I can do with what I've built so far.

So, I'm looking for help. I am of course willing to give whatever information I can provide in order to get that help. Since I don't know the answer, I can't ask the right questions just yet, so the question is basic: where do I start looking for performance enhancements now that my cluster is ready?

From Wendy:

>Be aware that cluster management and its associated performance tuning is
>really not a trivial task. It is kind of hard to give a "catch-all"
>advice in a mailing list, particularly we have been participating the
>discussions on our spare time basis.

I don't think anyone who bothers to take this on would think that it's trivial, or anything less than something they have to spend some time on. Asking 'catch-all' questions is often the only way to get the ball rolling, to invite additional questions which often lead to more meaningful help.
At least, in all of the endless fact-finding meetings I've been at, we'll often start with some basics and that turns into more relevant things.

So, my question is again the same :). Now that I have my cluster up and running, I still would like to ask those on the list for thoughts, input, and ideas on where they started looking for performance enhancements. I have a basic, non-cluster/GFS list of course: I'll need to work on fine-tuning my web servers, my storage and even my networking. The only thing is, are there cluster/GFS-related things I should be aware of when doing this, tips or input that others might have?

Mike

--
Linux-cluster mailing list
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster