On Tue, March 13, 2007 7:27 pm, Mark wrote:
> I have a web session management server that makes PHP clustering easy
> and fast. I have been getting a number of requests for some level of
> redundancy.
>
> As it is, I can save to an NFS or GFS file system, and be redundant
> that way.

Talk to Jason at http://hostedlabs.com if you haven't already.

He's rolling out a distributed redundant PHP architecture using your
MCache as an almost turn-key webhosting service.

Not quite sure exactly how he's making the MCache bit redundant, but
he's already doing it.

> Here is an explanation of how it works:
> http://www.mohawksoft.org/?q=node/36

NB: There is a typo in the "False Scalability" section:
"... but regardless of what you do you, every design has a limit."

> What would you be looking for? How would you expect redundancy to
> work?

In the ideal world, the developers are also working as an N-tier
architecture in their Personnel Org Chart. :-)

One Lead has to understand the whole system and the intricacies of
your system, as well as its implications and "gotchas", really well.

In an ideal world the Lead can then arrange things so that the other
(non-lead) Developers can just program "normally", with little or no
change to their process when rolling out to the scalable architecture.

This is not to say that they can squander resources, but rather that
if their algorithm "works" correctly and quickly (enough) on their dev
box with beefy datasets, it should seamlessly work on the scaled
boxes, assuming the production datasets are not disproportionately
larger once you pro-rate dataset size against the hardware.

Yes, if the algorithm is anything worse than O(n) this is not really
"safe", since the run time then grows faster than the dataset, but
it's a decent rule of thumb, and you can generally figure out pretty
fast if your algorithm is un-workable. At least in my experience, if
I can get it to "work" on a relatively large dataset on my crappy dev
box, the real server can deal with it.

So the less intrusive the redundant architecture can be, the better.

Documentation of exactly how it all works is crucial -- If it's all
hand-waving, the Lead will never be able to figure out where the
"gotchas" are going to be.

I'd also expect true redundancy all across the board, down to spare
screws for the rack-mounts. Hey, a screw *could* shear off at any
moment... :-)

Multiple data centers on a few different continents. US, Europe,
Asia, India (which seems to have caught the American consumerism
big-time lately...), Australia... Probably need 2 or 3 just in the US.

Some folks need WAY bigger farms than others. Offer a wide variety of
choices, from a simple failsafe roll-over up to sky's-the-limit on
every continent. [Well, okay, you can probably safely skip
Antarctica. :-)]

I'd like a "status board" web panel showing every significant piece
of gear and its health status in one screen of blinking lights. :-)

If I have to be the one to SysAdmin the things, make that a control
panel as well.

Okay, in reality, "I" would *not* be the one to SysAdmin that stuff,
as I would still need to hire a guy actually qualified to do that.

Which is why we're working with Jason (above), who's essentially our
out-source SysAdmin guy taking care of all this hardware and
redundancy stuff so we can focus on our Web App from a business
perspective (mostly) instead of constantly fighting with hardware.
[I am so *not* a hardware guy...]

And, of course, *when* an MCache box falls over, the user should
seamlessly be sent to the next-closest box, with their session data
already waiting for them.

I.e., it's not enough that there will always be a working MCache box
for new users -- Logged-in users have to have their session data
replicated to at least one other box.

There also have to be enough "spare" cycles in the sum of all boxes
that a single failure won't just take them all down in a domino
effect.
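
A rough sketch of those last two requirements, in Python rather than PHP purely for brevity. Everything in it is hypothetical: the box names, the `replicas` count, the function names and the capacity figures are made up for illustration, and I'm not claiming this is how MCache or Jason's setup actually works. The first function picks which boxes hold a session (using rendezvous hashing, which is just one way to do it); the second checks whether the farm has enough headroom to lose one box.

```python
import hashlib


def replica_nodes(session_id, nodes, replicas=2):
    """Return the `replicas` boxes that should hold this session.

    Uses rendezvous (highest-random-weight) hashing: every box gets a
    score for the session and the top scorers win. If the first box
    dies, the second one is already the top choice among the survivors.
    """
    def score(node):
        return hashlib.sha256(f"{node}:{session_id}".encode()).hexdigest()

    return sorted(nodes, key=score, reverse=True)[:replicas]


def survives_single_failure(loads, capacity):
    """True if the remaining boxes can absorb the total load after one box dies.

    Assumes identical boxes and that load spreads evenly over the survivors.
    """
    if len(loads) < 2:
        return False
    return sum(loads) <= capacity * (len(loads) - 1)
```

With three boxes at 60% each, losing one leaves the other two at 90%: tight, but alive. At 80% each, the survivors would need 120% apiece, and that's your domino effect.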
[shudder]

So it's gotta be more like RAID 5 or whatever it is, with the session
data striped across different boxes.

Something like this domino effect bit Dreamhost in the [bleep] a
while back on their switch setup. Actually, go read all their woes
on their blog/newsletter and don't do that. :-)
[Though at Dreamhost pricing, it's really hard to complain...]
[And at least they tell you they screwed up instead of making it a
State Secret.]

Speaking of pricing:

Session replication is just a tiny piece of the puzzle, really. A
crucial piece, relatively easy to factor out for most web apps, and a
great target for optimization and modularization for that very
reason. But one also needs to make the web-farm, the app-farm, and
the db-farm all scalable...

So if you can do ALL of those in one nice package, or even some of
those, that's a Good Thing, imho, as many of the issues you'll have
for session data are the same issues you'll have for web/app/db
interaction.

Or you could specialize in the session replication, and be a vendor
to the folks replicating whole systems -- Probably better, really, to
stay focused.

But either way, the price has to be pretty low, or your target market
is mostly companies who already have folks on staff who have already
solved this problem...

Disclaimer:
I'm hardly an expert in this stuff! You may want to ask on Internals,
or try to button-hole some gurus at a PHP conference.

--
Some people have a "gift" link here.
Know what I want?
I want you to buy a CD from some starving artist.
http://cdbaby.com/browse/from/lynch
Yeah, I get a buck. So?

--
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php