distributed storage and computing

purpleidea at gmail.com (James) · Mon, 28 Oct 2013 12:34:24 -0400

Since this is a fairly open-ended question, I'm top-posting with a
general suggestion.

Take anywhere from 15 minutes to a day to _try it out_. It's pretty
easy to do manually, or if you prefer, take a few VMs and try
puppet-gluster. It should let you build a Gluster pool very quickly.

https://github.com/purpleidea/puppet-gluster
also mirrored at:
https://forge.gluster.org/puppet-gluster
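
For the manual route, the steps look roughly like the sketch below. It
only prints the gluster commands instead of running them, and the host
names (vm1, vm2), volume name, brick path and mount point are made-up
placeholders, so treat it as an outline under those assumptions and
adjust it to your setup.

```python
"""Dry-run sketch: print the gluster CLI steps for a small replicated pool.

All names here (hosts, volume, brick path, mount point) are hypothetical
placeholders; nothing is executed, the commands are only generated as text.
"""


def pool_commands(hosts, volume="testvol", brick="/data/brick1",
                  mountpoint="/mnt/gluster"):
    """Return the shell commands to run, in order, from the first host."""
    if len(hosts) < 2:
        raise ValueError("need at least two hosts to form a pool")

    first, others = hosts[0], hosts[1:]

    # 1. From the first host, add every other host to the trusted pool.
    cmds = [f"gluster peer probe {h}" for h in others]

    # 2. Create a volume with one brick per host, replicated across all hosts.
    bricks = " ".join(f"{h}:{brick}/{volume}" for h in hosts)
    cmds.append(f"gluster volume create {volume} replica {len(hosts)} {bricks}")

    # 3. Start the volume so clients can use it.
    cmds.append(f"gluster volume start {volume}")

    # 4. On a client, mount the volume with the native (FUSE) client.
    cmds.append(f"mount -t glusterfs {first}:/{volume} {mountpoint}")
    return cmds


if __name__ == "__main__":
    for cmd in pool_commands(["vm1", "vm2"]):
        print(cmd)
```

Run it, read the output, then type the commands by hand on your test
VMs once glusterd is running on each of them.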

Come back with more questions after that :)

James

On Mon, Oct 28, 2013 at 9:13 AM, Tim van Elteren
<timvanelteren at gmail.com> wrote:
> Dear Gluster Community,
>
> After researching a number of distributed file systems for deployment in a
> production environment with the main purpose of performing both batch and
> real-time distributed computing, I've identified Gluster as a potential
> solution.
>
> The key properties that our system should exhibit:
>
> - open source and liberally licensed, yet production ready, i.e. a mature,
> reliable, community- and commercially supported solution;
> - ability to run on commodity hardware, preferably designed for it;
> - high availability of the data, with the main focus on reads;
> - high scalability, including operation across multiple data centres,
> possibly global;
> - removal of single points of failure with the use of replication and
> distribution of (meta-)data.
>
> The sensitivity points that were identified, and resulted in the following
> questions, are:
>
> 1) Transparency to the processing layer / application with respect to data
> locality, i.e. knowing where data is physically located at the server
> level, mainly for resource allocation and fast, high-performance
> processing. How can this be accomplished using GlusterFS?
>
> 2) POSIX compliance, or conformance: Hadoop, for example, isn't POSIX
> compliant by design. What are the pros and cons? What is GlusterFS's
> approach with respect to support for POSIX operations?
>
> 3) Mainly with respect to evaluating the production readiness of GlusterFS:
> where is it currently used in production environments, and for which
> specific use cases does it seem most suitable? Are there any known issues
> or common pitfalls, and are workarounds available?
>
> I realize that I've posed quite a lot of questions above but any answer or
> help, or links to where the information could be found, are very much
> appreciated :) In addition, specifically for GlusterFS:
>
> 4) It seems Gluster has an advantage in geo-replication over, for example,
> Ceph. What are the main advantages here?
> 5) Finally, what would be the most compelling reason to go for Gluster and
> not for the alternatives?
>
> I'm looking forward to your replies. Thanks in advance! :)
>
> With kind regards,
>
> Tim van Elteren