Re: Zookeeper instead of CLD in Hail

Jeff Darcy <jdarcy@xxxxxxxxxx> · Tue, 08 Jun 2010 11:07:37 -0400

On 06/07/2010 10:32 PM, Jeff Garzik wrote:
> I think you're overselling that angle a bit.  Google discourages use of 
> Chubby as a strict publish-subscribe mechanism, so watches and 
> ephemerals aren't the bread-and-butter of Chubby necessarily.  A lot of 
> the supposed commonality comes simply from the attribute of being a 
> centralized repository of data for autonomous cloud systems -- a 
> shared, highly reliable filesystem -- not any particular attribute 
> related to watches or ephemerals.

Nonetheless, those are features both share, and many might argue that
they're preferable to locks.  Locking is a fundamentally lousy way to
build scalable and reliable distributed systems, as has been well known
for more than a decade.  That's why databases have trended toward MVCC
or AP/EC, why programming models have migrated toward async queues and
STM or actors, etc.  If using Chubby or its derivatives as a pub/sub hub
is discouraged, then using it as a DLM should be outright condemned.
Somebody who had followed the recommended programming model(s) with
Chubby would have little trouble transitioning to ZK.  That's what I
meant by "familiar shape" - that people who've already learned how to
write distributed applications could still apply that hard-won knowledge
with Hail components.
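
To make the ephemeral-plus-watch style concrete, here is a toy sketch in
Python.  Everything in it is an assumption for illustration: ToyCoordinator,
create_ephemeral, watch_delete, expire and run_for_leader are hypothetical
names for an in-memory stand-in, not the real ZK or CLD API.  It only shows
the shape of the model: a node that vanishes with its owner's session, and
a one-shot notification that lets the next contender react.

```python
from collections import defaultdict


class ToyCoordinator:
    """In-memory stand-in for a ZK-like service (hypothetical, not a real API)."""

    def __init__(self):
        self.nodes = {}                   # path -> (data, owning session)
        self.watches = defaultdict(list)  # path -> one-shot delete callbacks

    def create_ephemeral(self, session, path, data):
        """Create a node tied to a session; fail if the path already exists."""
        if path in self.nodes:
            return False
        self.nodes[path] = (data, session)
        return True

    def watch_delete(self, path, callback):
        """Register a one-shot callback fired when the node is removed."""
        self.watches[path].append(callback)

    def expire(self, session):
        """Session died: drop its ephemeral nodes and fire their watches."""
        dead = [p for p, (_, s) in self.nodes.items() if s == session]
        for path in dead:
            del self.nodes[path]
            for callback in self.watches.pop(path, []):
                callback(path)


def run_for_leader(coord, session, log):
    """Claim /leader if it is free; otherwise wait for it to disappear."""
    if coord.create_ephemeral(session, "/leader", session):
        log.append(session + " leads")
    else:
        coord.watch_delete(
            "/leader", lambda _path: run_for_leader(coord, session, log)
        )


coord = ToyCoordinator()
log = []
run_for_leader(coord, "a", log)
run_for_leader(coord, "b", log)
coord.expire("a")  # "a" crashes; nobody has to issue an explicit unlock
print(log)
```

Nobody blocks waiting on a lock here, and cleanup after a crash falls out of
session expiry rather than depending on the failed holder to release
anything.  A real service would also have to deal with session timeouts,
reconnects and contention among many waiters, none of which this sketch
attempts.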

> Not sure what you mean by familiar shape?  ZK design is too haphazard, 
> and developers IMO have an easier time grasping CLD's fundamental 
> FS-like API.

Only in the sense that shared-state locking can be implemented in
trivial systems more readily than other concurrency models.
Unfortunately, as those systems scale beyond the trivial, it usually
becomes much harder (if not impossible) to keep them working and
performing well.  Rewrites from locking to other models are common in
this space.  People who haven't yet learned these lessons shouldn't be
our target audience, and enabling their mistakes shouldn't be a goal.

> The implementation is also 3x or more in terms of size, 
> compared to CLD.

Small code size is not, in and of itself, a virtue.  It can mean more
efficient implementation, but it can also mean reduced functionality or
unaddressed concerns (especially wrt error conditions or observability).
I'm sure I could write something vaguely like Chubby that's even
smaller than CLD, but it would be absurd to claim that it must therefore
be better.

> I'd rather get the basics right first, then pursue a popularity contest :)

Popularity is not a goal, but it can be a means.  The world doesn't need
another Walrus, but it doesn't need another SML/Haskell languishing in
obscurity either.  The open-source landscape is littered with
developers' special little flowers that withered away because nobody
ever made them even slightly interesting to a wider audience.  No
attempt to provide building blocks for distributed systems can succeed
while remaining ignorant of (or antagonistic toward) established ways of
building such systems.  Look at how much developers fawn over projects
like Tokyo or Redis, which are fundamentally non-scalable.  Look at how
they use systems like memcached - in production, successfully - despite
it having a clearly ill-thought-out approach to consistency.  If Hail is
supposed to be about building blocks, then those should be about the
building blocks that people need, not the building blocks that are easy
or fun to create.  If somebody who already had ZK running could point
chunkd/tabled at that instead of having to download/install/configure
CLD (and fight with their IT folks about the DNS dependency), then they
might actually be more likely to use chunkd/tabled instead of something
even more broken.  Some few of them might even start to contribute.
Surely that would be a good thing.  If self-imposed technical obstacles
still make that too much effort, then fine, but I don't think we need
more NIH syndrome.