Re: Help me specify/develop a feature! (cluster web sessions management)

"Richard Lynch" <ceo@xxxxxxxxx> · Wed, 14 Mar 2007 00:49:30 -0500 (CDT)

On Tue, March 13, 2007 7:27 pm, Mark wrote:
> I have a web session management server that makes PHP clustering easy
> and fast. I have been getting a number of requests for some level of
> redundancy.
>
> As it is, I can save to an NFS or GFS file system, and be redundant
> that way.

Talk to Jason at http://hostedlabs.com if you haven't already.

He's rolling out a distributed redundant PHP architecture using your
MCache as an almost turn-key webhosting service.

Not quite sure exactly how he's making the MCache bit redundant, but
he's already doing it.

> Here is an explanation of how it works:
> http://www.mohawksoft.org/?q=node/36

NB:
There is a typo in "False Scalability" section:
"... but regardless of what you do you, every design has a limit."

> What would you be looking for? How would you expect redundancy to
> work?

In the ideal world, the developers are also working as an N-tier
architecture in their Personnel Org Chart. :-)

One Lead has to understand the whole system and the intricacies of
your system, as well as its implications and "gotchas" really well.

In an ideal world the Lead can then arrange things so that other
Developers (non lead) can just program "normally" and have little or
no impact on their process to roll-out to the scalable architecture.

This is not to say that they can squander resources, but rather that
if their algorithm "works" correctly and quickly (enough) on their dev
box with beefy datasets, it should work seamlessly on the scaled
boxes, assuming the production datasets are not disproportionately
larger once you pro-rate dataset size against the hardware.

Yes, if the algorithm is anything worse than O(n) this is not really
"safe", since run time grows faster than the dataset does, but it's a
close rule of thumb, and you can generally figure out pretty fast if
your algorithm is unworkable.

At least in my experience, if I can get it to "work" on a relatively
large dataset on my crappy dev box, the real server can deal with it.

So the less intrusive the redundant architecture can be, the better.

Documentation of exactly how it all works is crucial -- If it's all
hand-waving, the Lead will never be able to figure out where the
"gotchas" are going to be.

I'd also expect true redundancy all across the board, down to spare
screws for the rack-mounts.  Hey, a screw *could* shear off at any
moment... :-)

Multiple data centers on a few different continents.
US, Europe, Asia, India (which seems to have caught the American
consumerism big-time lately...) Australia...
Probably need 2 or 3 just in the US.

Some folks need WAY bigger farms than others. Offer a wide
variety of choices, from a simple failsafe roll-over up to
sky's-the-limit on every continent.
[Well, okay, you can probably safely skip Antarctica. :-)]

I'd like a "status board" web panel of every significant piece of gear
and a health status in one screen of blinking lights. :-)

If I have to be the one to SysAdmin the things, make that a control
panel as well.

Okay, in reality, "I" would *not* be the one to SysAdmin that stuff,
as I would still need to hire a guy actually qualified to do that.

Which is why we're working with Jason (above) who's essentially our
outsourced SysAdmin guy taking care of all this hardware and
redundancy stuff so we can focus on our Web App from a business
perspective (mostly) instead of constantly fighting with hardware.
[I am so *not* a hardware guy...]

And, of course, *when* an MCache box falls over, the user should
seamlessly be sent to the next-closest box, with their session data
already waiting for them.

I.e., it's not enough that there will always be a working MCache box
for new users -- Logged-in users have to have their session data
replicated to at least one other box.
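
To make that requirement concrete, here is a minimal Python sketch.
It is a toy in-memory model, not MCache's actual API: the
SessionCluster class, the box names, and the "hash the session id,
then use the next box in the ring as the replica" placement rule are
all my own assumptions for illustration. It also assumes at least two
boxes.

```python
import hashlib


class SessionCluster:
    """Toy in-memory stand-in for a set of session boxes (hypothetical, not MCache)."""

    def __init__(self, box_names):
        self.boxes = {name: {} for name in box_names}  # box name -> {session_id: data}
        self.down = set()                               # boxes currently marked as failed

    def _placement(self, session_id):
        """Return [primary, replica]: a hashed start position and the next box in the ring."""
        names = sorted(self.boxes)
        digest = hashlib.sha256(session_id.encode("utf-8")).digest()
        start = int.from_bytes(digest[:8], "big") % len(names)
        return [names[start], names[(start + 1) % len(names)]]

    def write(self, session_id, data):
        """Store the session on every live box in its placement; return the copy count."""
        copies = 0
        for name in self._placement(session_id):
            if name not in self.down:
                self.boxes[name][session_id] = dict(data)
                copies += 1
        return copies

    def read(self, session_id):
        """Try the primary first, then fall back to the replica; None if both are gone."""
        for name in self._placement(session_id):
            if name not in self.down and session_id in self.boxes[name]:
                return self.boxes[name][session_id]
        return None
```

With three boxes, a write lands on two of them, so marking the primary
as down still lets read() find the session on the replica. Lose both
and the session is gone, which is the case the "at least one other
box" rule is meant to make rare.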

There also have to be enough "spare" cycles in the sum of all boxes
that a single failure won't just take them all down in a domino
effect. [shudder]
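
The back-of-the-envelope arithmetic for that can be sketched in a few
lines of Python. The function names are made up, and the model assumes
the failed box's load gets spread evenly over the survivors, which a
real load balancer may or may not do.

```python
def load_after_failures(box_count, utilization, failures=1):
    """Per-box utilization once `failures` boxes are gone and their load is spread evenly.

    `utilization` is the average fraction of capacity in use per box (0.0 to 1.0)
    before anything fails.
    """
    survivors = box_count - failures
    if survivors <= 0:
        raise ValueError("no boxes left to carry the load")
    return utilization * box_count / survivors


def survives(box_count, utilization, failures=1):
    """True if the surviving boxes stay at or below 100% of their capacity."""
    return load_after_failures(box_count, utilization, failures) <= 1.0
```

Four boxes running at 60% each end up at roughly 80% after one
failure, which is fine. Four boxes at 80% each end up over 100%, and
that is where the next box falls over, and then the next.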

So it's gotta be more like RAID 5 or whatever it is, with the session
data striped across different boxes.
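
For what "striped with parity" could look like, here is a rough Python
sketch of the XOR trick that RAID 5 is built on. The function names
are hypothetical, and it is a simplification: real RAID 5 rotates the
parity across the disks, while this keeps it in a fixed last chunk,
and it pads with zero bytes, so the original length has to be stored
somewhere else.

```python
def stripe_with_parity(data, data_boxes):
    """Split `data` (bytes) into `data_boxes` equal-length chunks plus one XOR parity chunk."""
    chunk_len = -(-len(data) // data_boxes)  # ceiling division
    padded = data.ljust(chunk_len * data_boxes, b"\x00")
    chunks = [padded[i * chunk_len:(i + 1) * chunk_len] for i in range(data_boxes)]
    parity = bytes(chunk_len)  # starts as all zero bytes
    for chunk in chunks:
        parity = bytes(a ^ b for a, b in zip(parity, chunk))
    return chunks + [parity]


def rebuild_missing(chunks):
    """Recover the single chunk that is None by XOR-ing all the chunks that survived."""
    missing = [i for i, chunk in enumerate(chunks) if chunk is None]
    if len(missing) != 1:
        raise ValueError("exactly one chunk must be missing")
    present = [chunk for chunk in chunks if chunk is not None]
    rebuilt = bytes(len(present[0]))
    for chunk in present:
        rebuilt = bytes(a ^ b for a, b in zip(rebuilt, chunk))
    return rebuilt
```

Put each chunk on a different box and any one box can die without
losing the session, at the cost of one extra chunk rather than a full
second copy. Whether that beats plain replication for small session
blobs is a separate question.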

Something like this domino effect bit Dreamhost in the [bleep] a while
back on their switch setup.  Actually, go read all their woes on their
blog/newsletter and don't do that. :-)
[Though at Dreamhost pricing, it's really hard to complain...]
[And at least they tell you they screwed up instead of making it a
State Secret.]

Speaking of pricing:
Session replication is just a tiny piece of the puzzle, really.  A
crucial piece, relatively easy to factor out for most web apps, and a
great target for optimization and modularization for that very reason.

But one also needs to make the web-farm, the app-farm, and the db-farm
all scalable...

So if you can do ALL of those in one nice package, or even some of
those, that's a Good Thing, imho, as many of the same issues you'll
have for session data are the same issues for web/app/db interaction.

Or you could specialize in the session replication, and be a vendor to
the folks replicating whole systems -- Probably better, really, to
stay focussed.

But either way, the price has to be pretty low, or your target market
is mostly companies who already have folks on staff who have already
solved this problem...

Disclaimer:
I'm hardly an expert in this stuff!
You may want to ask on Internals, or try to button-hole some gurus at
a PHP conference.

-- 
Some people have a "gift" link here.
Know what I want?
I want you to buy a CD from some starving artist.
http://cdbaby.com/browse/from/lynch
Yeah, I get a buck. So?
