Zookeeper instead of CLD in Hail

Hi, guys:

I spent a few days playing with Zookeeper, with an eye on replacing
CLD with it. The short recommendation: don't do it, at least for now,
but reconsider if any sister services make good use of it (e.g.
if MRG/DC image store does).

The easiest way to replace CLD would be to use Zookeeper as if it
were CLD, so I wrote a test that locked a file like cldu.c does now.
It was not too bad, but I learned two things:
 - what exactly Garzik was saying about "different focus" in Q&As
   after his presentations, and
 - locking anything is a really awkward thing to do in Zookeeper.

About the focus, ZK is just like CLD from a certain angle (it has
the good old files and provides a set of un-posixy operations
on them: watches, uniques, "ephemerals"), but it's also entirely
unlike CLD (e.g. no locks in the protocol). CLD's model is that
clients are daemons, each of which reads a few of its files, maybe
locks one or two at boot, and then nothing happens except keepalives.
Zookeeper's model... honestly I don't know what it is because it's
never explained concisely, but the docs that I saw seem to imply
huge numbers of clients all doing random ops all the time on the
same files, enough to raise thundering-herd concerns. It looks like Yahoo
may be using Zookeeper as a lease manager or something. Crazy.

I heard people say they cribbed from the same Chubby paper, but
it's bollocks. It's absolutely nothing like what Chubby implies.
No locks for one thing. To be sure, Zookeeper provides a canned
piece of code which implements locks, kinda like you can implement
compare-and-swap using Dekker's algorithm on a CPU that doesn't
have it. The canned lock creates "sequenced" files (using a ZK
server call that creates unique filenames), then sets some
"watches" (same as CLD offers), then re-reads the directory to
find the lowest number sequential file, which is the winner of
the lock. Haha, only serious. I tested it, it works, but ewwwww.
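
A minimal sketch of the sequenced-file lock described above, written in
Python against an in-memory stand-in rather than a real Zookeeper
server. FakeDir, create_sequential, seq_of and try_lock are hypothetical
names made up for illustration; none of them are the Zookeeper client
API. The sketch also assumes that a loser watches the file just ahead
of its own, which is one way to keep the herd down, and it leaves out
the watch callbacks and session handling entirely.

```python
import itertools


class FakeDir:
    """In-memory stand-in for one lock directory (not the real ZK API)."""

    def __init__(self):
        self._seq = itertools.count()
        self.children = {}  # file name -> owner

    def create_sequential(self, prefix, owner):
        # The server appends a unique, increasing suffix to the name.
        name = "%s%010d" % (prefix, next(self._seq))
        self.children[name] = owner
        return name

    def delete(self, name):
        self.children.pop(name, None)


def seq_of(name):
    # The last 10 characters are the server-assigned sequence number.
    return int(name[-10:])


def try_lock(d, my_node):
    """Re-read the directory; return (held, node_to_watch)."""
    names = sorted(d.children, key=seq_of)
    idx = names.index(my_node)
    if idx == 0:
        return True, None          # lowest number wins the lock
    return False, names[idx - 1]   # otherwise wait on the predecessor


d = FakeDir()
a = d.create_sequential("lock-", "client-a")
b = d.create_sequential("lock-", "client-b")
print(try_lock(d, a))
print(try_lock(d, b))
d.delete(a)  # client-a unlocks (or its session dies)
print(try_lock(d, b))
```

Unlocking is just deleting your own file; whoever was watching it
re-reads the directory and finds out whether it now holds the lowest
number.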

They clearly want daemons to approach the whole problem in a
different way. For example, there's a similar canned recipe to
identify a "leader" client.

Overall, ZK seems like a mature, if quirky, system. Quirky means that
I made my client OOM hard by using wrong compilation options, and it
took me a while to figure it out (PROTIP: do not use "single-threaded"
mode in Zookeeper, it is not loved and canned recipes may plain not
work with it). There were some other weird stories. But it definitely
works. Unfortunately, with the latest fix for the timer CLD works too:
I've not seen a server crash in a couple of months. So I do not see
an upside for us to switch at this point, and I have better things
to do than learning the Zookeeper ropes for weeks.

BTW, Zookeeper is not packaged in Fedora. You have to install it
by hand. Thank heavens for /usr/local.

Dunno what their community is like. I'm going to send a trivial
patch to them and see what happens.

-- Pete