> > Is there anything written up on why you/all want every
> > node to be completely conscious of every other node?
> >
> > I could see a couple of architectures that might work
> > better (be more scalable) if the config minutiae were
> > either not necessary to be shared or shared in only
> > cases where the config minutiae were a dependency.
>
> Well, these aren't exactly minutiae. Everything at file or directory level is
> fully distributed and will remain so. We're talking only about stuff at the
> volume or server level, which is very little data but very broad in scope.
> Trying to segregate that only adds complexity and subtracts convenience,
> compared to having it equally accessible to (or through) any server.

Sorry, I didn't have time this morning to add more detail.

Note that my concern isn't bandwidth, it's flexibility; the less
knowledge needed, the more I can do crazy things in user land, like
running boxes in different data centres and randomly powering things
up and down, re-addressing, replacing in-box hardware, load balancing,
NATing, etc. It makes a dynamic environment difficult to construct
when, for example, Gluster rejects the same volume-id being presented
to an existing cluster from a new GFID.

But there's no need to go even that complicated; let me pull out an
example of where shared knowledge may be unnecessary.

The work that I was doing in Gluster (pre glusterd) drove out one
primary "server" which fronted a Replicate volume of both its own
Distribute volume and that of another server or two, each of which
served a single Distribute volume. So the client connected to one
server for one volume and the rest was black box / magic (from the
client's perspective: big fast storage in many locations). In that
case it could be said that the servers needed some shared knowledge,
while the clients didn't.
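As a rough illustration of "who needs to know whom", the layout above
can be modelled as a small graph. This is only a sketch: Volume,
peers_needed, client_view and the host names are hypothetical and are
not Gluster structures or APIs. It records which machine assembles
each volume and lists the other hosts that machine must be told about.
The second graph (flat) is a contrasting layout in which the client
itself assembles the Distribute volume over plain served bricks.

```python
from dataclasses import dataclass, field


@dataclass
class Volume:
    name: str
    host: str  # machine that assembles/serves this volume (hypothetical model)
    subvolumes: list = field(default_factory=list)


def peers_needed(node, volume):
    """Other hosts `node` must know about to assemble its share of the graph."""
    needed = set()
    for sub in volume.subvolumes:
        # Only the machine that assembles `volume` has to reach its subvolumes.
        if volume.host == node and sub.host != node:
            needed.add(sub.host)
        needed |= peers_needed(node, sub)
    return needed


def client_view(client, top):
    """Hosts a client must know in order to mount `top`."""
    if top.host != client:
        return {top.host}  # served graph: one address, the rest is a black box
    return peers_needed(client, top)  # client-assembled graph


def bricks(host, count):
    return [Volume(f"{host}-brick{i}", host) for i in range(count)]


# Server-side layout: server-a fronts a Replicate over two Distribute volumes.
dist_a = Volume("dist-a", "server-a", bricks("server-a", 2))
dist_b = Volume("dist-b", "server-b", bricks("server-b", 2))
repl = Volume("repl", "server-a", [dist_a, dist_b])

# Contrasting layout: the client assembles Distribute over served bricks.
flat = Volume(
    "dist",
    "client",
    bricks("server-a", 1) + bricks("server-b", 1) + bricks("server-c", 1),
)
```

In the first graph client_view("client", repl) contains only server-a,
and server-a is the only machine that needs to know about server-b. In
the second, the client needs every server and no server needs to know
about any other.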

The equivalent configuration in a glusterd world (from my experiments)
pushed all of the distribute knowledge out to the client, and I haven't
had a response as to how to add a replicate on distributed volumes in
this model, so I've lost replicate. But in this world, the client must
know about everything and the server is simply a set of
served/presented disks (as volumes). In this glusterd world, then, why
does any server need to know of any other server, if the clients are
doing all of the heavy lifting?

The additional consideration is where the server both consumes and
presents, but this would be captured in the client side view. i.e.
given where glusterd seems to be driving, this knowledge seems to be
needed on the client side (within glusterfs, not glusterfsd).

To my mind this breaks the gluster architecture that I read about in
2009, but I need to stress that I didn't get a reply to the glusterd
architecture question that I posted about a month ago, so I don't know
if glusterd is currently limiting deployment options because:

  - there is an intention to drive the heavy lifting to the client
    (for example for performance reasons in big deployments), or;
  - there are known limitations in the existing bricks/modules
    (for example moving files through distribute), or;
  - there is ultimately (long term) more flexibility seen in this
    model (and we're at a midway point between pre glusterd and post,
    so it doesn't feel that way yet), or;
  - there is an intent to drive out a particular market outcome or
    match an existing storage model (the gluster presentation was
    driving towards cloud, and maybe those vendors don't use server
    side implementations), etc.

As I don't have a clear/big picture in my mind, my apologies if I'm
not considering all of the impacts.

> > RE ZK, I have an issue with it not being a binary at
> > the linux distribution level. This is the reason I don't
> > currently have Gluster's geo replication module in
> > place ..
>
> What exactly is your objection to interpreted or JIT compiled languages?
> Performance? Security? It's an unusual position, to say the least.
>

Specifically, primarily, space. Saturn builds GlusterFS capacity from
a 48 Megabyte Linux distribution, and adding many Megabytes of Perl
and/or Python and/or PHP and/or Java for a single script is
impractical.

My secondary concern is licensing (specifically in the Java run-time
environment case). Hadoop forced my hand; GNU's JRE/compiler wasn't up
to the task of running Hadoop when I last looked at it (about 2 or 3
years ago now) - well, it could run a 2007 or so version but not the
current ones at that time - so now I work with Gluster ..

Going back to ZkFarmer and considering other architectures: how much
external support you need depends on how you slice and dice the
problem.

> I've long felt that our ways of dealing with cluster
> membership and staging of config changes is not
> quite as robust and scalable as we might want.

By way of example:

The openMosix kernel extensions maintained their own information
exchange between cluster nodes; if a node (IP) was added via the /proc
interface, it was "in" the cluster. Cluster membership was therefore
the hand-off/interface.

It could be as simple as a text list on each node, or it could be left
to a user space daemon which could then gate cluster membership - this
suited everyone with a small cluster.

The native daemon (omdiscd) used multicast packets to find nodes and
then stuffed those IPs into the /proc interface - this suited everyone
with a private/dedicated cluster.

A colleague and I wrote a TCP variation to allow multi-site discovery,
with SSH public key exchanges and IPsec tunnel establishment as part of
the gating process - this suited those with a distributed/part-time
cluster.

To ZooKeeper's point (http://zookeeper.apache.org/), the discovery
protocol that we created was weak, and I've since found a
model/algorithm that allows for far more robust discovery.
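
The hand-off idea above (membership is just "an address was added",
and anything smarter sits in front of it as a gate) can be sketched in
a few lines. This is an assumption-laden illustration and not
openMosix or omdiscd code: Membership, from_text_list and
allow_networks are hypothetical names, the real /proc interface is not
touched, and Python is used only for brevity. A multicast or TCP/SSH
discovery daemon would simply be another source calling offer().

```python
import ipaddress


class Membership:
    """Stand-in for the kernel hand-off: a node is "in" once its IP is added."""

    def __init__(self, gate=None):
        self._nodes = set()
        # The gate decides whether an offered address is admitted;
        # with no gate supplied, every valid address is accepted.
        self._gate = gate or (lambda addr: True)

    def offer(self, ip):
        addr = ipaddress.ip_address(ip)  # raises ValueError on junk input
        if self._gate(addr):
            self._nodes.add(addr)
            return True
        return False

    def members(self):
        return sorted(str(addr) for addr in self._nodes)


def from_text_list(text):
    """Small-cluster source: one address per line, '#' starts a comment."""
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if line:
            yield line


def allow_networks(*cidrs):
    """Build a gate that only admits addresses inside the given networks."""
    nets = [ipaddress.ip_network(cidr) for cidr in cidrs]
    return lambda addr: any(addr in net for net in nets)


# Example: a static list gated to a single private network.
cluster = Membership(gate=allow_networks("10.0.0.0/24"))
static_list = """
10.0.0.1
10.0.0.2      # second node
192.168.1.5   # outside the allowed network, so it is refused
"""
results = {ip: cluster.offer(ip) for ip in from_text_list(static_list)}
```

Here the two 10.0.0.x addresses are admitted and 192.168.1.5 is
refused. Swapping the gate or the source changes the policy without
changing the hand-off itself.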

The point being that, depending on the final cluster architecture for
gluster (i.e. all nodes are peers and thus all are cluster members;
nodes are client or server and both are cluster members; nodes are
client or server and only clients [or servers] are cluster members;
etc.), there may be simpler cluster management options ..


Cheers,


--
Ian Latter
Late night coder ..
http://midnightcode.org/