> No one node or set of nodes should hold the
> cluster hostage.

Agreed - this is fundamental.

> We are revisiting this situation now because we
> want to scale to 1000s of nodes potentially.

Good, I hate upper bounds on architectures :)

Though I haven't tested my own implementation, I understand
that one implementation of the discovery protocol that I've
used scaled to 20,000 hosts across three sites in two
countries; this is the type of robust outcome that can be
manipulated at the macro scale - i.e. without manipulating
per-node details.

> Gluster CLI operations should not time out or
> slow down.

This is critical - not just the CLI but also the storage
interface (in a redundant environment); infrastructure wears
and fails, thus failing infrastructure should be regarded as
the norm/default.

> If ZK requires proprietary JRE for stability,
> Java will be NO NO!.

*Fantastic*

> My point is to keep things simple as we scale.

I couldn't agree more. On that principle I ask that each
dependency on cluster knowledge be considered carefully
with a minimalist approach.

--
Ian Latter
Late night coder ..
http://midnightcode.org/