Re: GlusterFS vs Ceph and Hadoop

In addition, I'd like to point out that:

- GlusterFS can leave your files in one piece. Call me old-fashioned, but for a "normal" workload I like that.
- Ceph also needs an underlying filesystem, so you will be left with fsck questions etc. there as well.
- CephFS is, as of now, still not production-ready, and judging by the news it isn't the main focus right now.
So you either have to use RBDs (and which filesystem inside?) or make your applications use RADOSGW.
So for my actual scenario (a small number of servers, n-way replication as a VM image storage backend),
GlusterFS seems the better approach to me (less administration effort, and my files stay in one piece).
hth

Bernhard

On Dec 29, 2013, at 1:50 AM, Peter B. <pb@xxxxxxxxxxxxxxxxx> wrote:

On 12/28/2013 08:06 PM, Knut Moe wrote:
Are there benefits in configuration, scaling, management etc?

Hi,
One year ago I was also in the situation where I had to compare these 3 cluster
file systems. I'm not the greatest expert on this topic, but here's a rough summary of what I know and why we chose GlusterFS over Hadoop or Ceph.

There are great differences between Hadoop/Ceph and GlusterFS.
The first is that GlusterFS distributes whole files and works on top of
existing filesystems. It is completely transparent to the system and
applications.
This adds overhead (at the cost of speed), but increases your options in
case of errors and for long-term preservation.

Most other storage cluster systems, on the other hand, split each file
into a certain number of blocks and distribute those.
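
To make the difference concrete, here is a toy sketch in Python. It is not
how GlusterFS or Ceph actually place data; the hash-modulo placement, the
brick names and the function names are all assumptions made up for the
illustration. It only shows that file-level distribution leaves each file
intact on one brick, while block-level distribution scatters pieces of it.

```python
import hashlib
from typing import Dict, List


def brick_for(key: str, bricks: List[str]) -> str:
    """Pick a brick deterministically from a hash of the key (toy placement)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return bricks[int.from_bytes(digest[:4], "big") % len(bricks)]


def place_whole_file(name: str, data: bytes, bricks: List[str]) -> Dict[str, List[bytes]]:
    """File-level distribution: the entire file lands on a single brick."""
    return {brick_for(name, bricks): [data]}


def place_in_blocks(name: str, data: bytes, bricks: List[str],
                    block_size: int) -> Dict[str, List[bytes]]:
    """Block-level distribution: cut the file into blocks, place each one separately."""
    placement: Dict[str, List[bytes]] = {}
    for offset in range(0, len(data), block_size):
        block = data[offset:offset + block_size]
        index = offset // block_size
        # Each block gets its own key, so blocks can end up on different bricks.
        placement.setdefault(brick_for(f"{name}#{index}", bricks), []).append(block)
    return placement


if __name__ == "__main__":
    bricks = ["brick-a", "brick-b", "brick-c"]  # hypothetical brick names
    data = b"x" * 1000
    print(place_whole_file("video.mkv", data, bricks).keys())
    print({b: len(v) for b, v in place_in_blocks("video.mkv", data, bricks, 256).items()})
```

In the first case you can walk up to the brick's own filesystem and copy the
file off with ordinary tools; in the second you need the cluster software to
reassemble it.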

Hadoop is not just some filesystem you can mount (as I understood from
reading Apache's docs): you have to talk to it using the Hadoop API. As
I understand it, you'd have to write your applications to use Hadoop,
instead of just having it as a transparent filesystem underneath. Please
correct me if I'm wrong.
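
As a small illustration of what "transparent" means in practice, here is a
minimal Python sketch, assuming a GlusterFS volume is already mounted
somewhere. The mount path in the comment and the function name are
hypothetical; the demo uses a temporary directory as a stand-in so it runs
anywhere. The point is only that the application uses ordinary file calls
and does not know what sits underneath.

```python
import os
import tempfile


def save_and_reload(mount_point: str, name: str, payload: bytes) -> bytes:
    """Write a file and read it back using ordinary file I/O only."""
    path = os.path.join(mount_point, name)
    with open(path, "wb") as handle:
        handle.write(payload)
    with open(path, "rb") as handle:
        return handle.read()


if __name__ == "__main__":
    # Stand-in for a real mount point (hypothetical example: /mnt/glustervol).
    with tempfile.TemporaryDirectory() as stand_in_mount:
        print(save_and_reload(stand_in_mount, "hello.txt", b"hello"))
```

With a store that is only reachable through its own API, the same two steps
would have to go through that system's client library instead of open().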

This makes a great difference regarding effort and use-cases.


What's also special about GlusterFS is that it's really (really) easy
and quick to set up a basic configuration, and it is available out of the
box in most distro repositories. There are things you can tweak, but you
don't have to. Any regular Linux admin can understand it very quickly.
Just yesterday, I set up Gluster on my Raspberry Pi home fileserver :)

There are many many more details about how Hadoop and Ceph "tick",
compared to GlusterFS (e.g. "NameNode"), but I think there are others
here who can explain that way better than me.


I know Hadoop is used by the likes of Yahoo and Facebook. I would be
interested in information on any large (known) users of GlusterFS.

When I was looking for systems for large storage clusters for long-term media
archiving, I initially thought I'd go for Hadoop, because "the big ones are
using it". I also listened to a presentation about Ceph, so I could compare them.

In the end, we chose GlusterFS, because for digital archiving we needed a scalable storage cluster with the highest chance of being maintainable over the years, rather than one that minimizes downtime to seconds. We are currently building storage for the national A/V archive with 2x >300 TiB.

I've already done initial tests with GlusterFS on one node, but it's too soon to really speak of "experience" on our side.
We've also only used gluster in "distribute" mode, so I have no experience with gluster-replication at all.


Regards,
Peter B.

_______________________________________________
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
http://supercolony.gluster.org/mailman/listinfo/gluster-users



Ecologic Institute
Bernhard Glomm
IT Administration

Phone: +49 (30) 86880 134
Fax: +49 (30) 86880 100
Skype: bernhard.glomm.ecologic
Ecologic Institut gemeinnützige GmbH | Pfalzburger Str. 43/44 | 10717 Berlin | Germany
GF: R. Andreas Kraemer | AG: Charlottenburg HRB 57947 | USt/VAT-IdNr.: DE811963464
Ecologic™ is a Trade Mark (TM) of Ecologic Institut gemeinnützige GmbH


