On 09/06/2014 05:55 PM, Pranith Kumar Karampuri wrote:
>
> On 09/05/2014 03:51 PM, Kaushal M wrote:
>> GlusterD performs the following functions as the management daemon
>> for GlusterFS:
>> - Peer membership management
>> - Maintaining consistency of configuration data across nodes
>>   (distributed configuration store)
>> - Distributed command execution (orchestration)
>> - Service management (managing the GlusterFS daemons)
>> - Portmap service for the GlusterFS daemons
>>
>> This proposal aims to delegate the above functions to technologies
>> that solve these problems well. We aim to do this in a phased
>> manner. The technology alternatives we will be looking at should
>> have the following properties:
>> - Open source
>> - Vibrant community
>> - Good documentation
>> - Easy to deploy/manage
>>
>> This would allow GlusterD's architecture to be more modular. We also
>> aim to make GlusterD's architecture as transparent and observable as
>> possible, and separating out these functions would allow us to do
>> that.
>>
>> The bulk of the current GlusterD code deals with keeping the
>> configuration of the cluster, and of the volumes in it, consistent
>> and available across the nodes. The current algorithm is not
>> scalable (N^2 in the number of nodes) and doesn't prevent
>> split-brain of the configuration. This is the problem area we are
>> targeting for the first phase.
>>
>> As part of the first phase, we aim to delegate the distributed
>> configuration store. We are exploring consul [1] as a replacement
>> for the existing distributed configuration store (the sum total of
>> /var/lib/glusterd/* across all nodes). Consul provides a distributed
>> configuration store that is consistent and partition-tolerant. By
>> moving all Gluster-related configuration information into consul, we
>> could avoid split-brain situations.
>
> Did you get a chance to go over the following questions while making
> the decision? If yes, could you please share the info?
> - What are the consistency guarantees for changing the configuration
>   in case of network partitions? Specifically, when there are 2
>   nodes and 1 of them is not reachable? And when there are more than
>   2 nodes?
> - What are the consistency guarantees for reading the configuration
>   in case of network partitions?

The Consul documentation claims that it can recover from network
partitions: http://www.consul.io/docs/internals/jepsen.html

Having said that, we are yet to do this POC.

~Atin

> Pranith
>
>> All development efforts towards this proposal would happen in
>> parallel to the existing GlusterD code base. The existing code base
>> would be actively maintained until GlusterD-2.0 is production-ready.
>>
>> This is in alignment with the GlusterFS Quattro proposals on making
>> GlusterFS scalable and easy to deploy, and is the first phase of
>> groundwork towards that goal.
>>
>> Questions and suggestions are welcome.
>>
>> ~kaushal
>>
>> [1] : http://www.consul.io/
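
P.S. To make the consistency question concrete, below is a minimal,
purely illustrative sketch of how volume configuration could be
written to and read from Consul's KV store using its official Go
client (github.com/hashicorp/consul/api). The key layout under
"gluster/volumes/..." is an assumption for illustration only; no
GlusterD-2.0 schema has been decided yet.

// Hypothetical sketch only: the key layout below is NOT an actual
// GlusterD-2.0 schema, just an example of a Consul-backed store.
package main

import (
    "fmt"
    "log"

    "github.com/hashicorp/consul/api"
)

func main() {
    // Connect to the local Consul agent (127.0.0.1:8500 by default).
    client, err := api.NewClient(api.DefaultConfig())
    if err != nil {
        log.Fatal(err)
    }
    kv := client.KV()

    // Writes are forwarded to the Consul leader and committed via
    // Raft: they either reach a quorum of servers or fail outright,
    // so a minority partition cannot accept a divergent write.
    pair := &api.KVPair{
        Key:   "gluster/volumes/vol0/options/performance.cache-size",
        Value: []byte("256MB"),
    }
    if _, err := kv.Put(pair, nil); err != nil {
        log.Fatal(err) // fails when no quorum is reachable
    }

    // RequireConsistent forces a linearizable read: the leader
    // re-verifies its leadership with a quorum before answering,
    // so a partitioned minority cannot serve stale configuration.
    opts := &api.QueryOptions{RequireConsistent: true}
    got, _, err := kv.Get(pair.Key, opts)
    if err != nil {
        log.Fatal(err)
    }
    if got == nil {
        log.Fatal("key not found")
    }
    fmt.Printf("%s = %s\n", got.Key, got.Value)
}

This also speaks to the two-node question above: with only 1 of 2
Consul servers reachable there is no quorum, so writes fail instead of
splitting the configuration, and consistent reads are refused rather
than served stale.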