On Wed, Sep 5, 2012 at 12:00 PM, John Doe <jdmls@xxxxxxxxx> wrote:
> From: Dennis Jacobfeuerborn <dennisml@xxxxxxxxxxxx>
>
> >On 09/05/2012 07:14 AM, Bob Hepple wrote:
> >> Another factor is that the available space is the physical space
> >> divided by 4 due to the replication across the nodes on top of the
> >> nodes being RAID'd themselves.
> >That really depends on your setup. I'm not sure what you mean by the nodes
> >being raided themselves.
> I think he meant gluster "RAID1" plus hardware RAID (10 I guess from the
> x2, instead of standalone disks).
>
> JD
> _______________________________________________
> CentOS mailing list
> CentOS@xxxxxxxxxx
> http://lists.centos.org/mailman/listinfo/centos
>

Hello,

this comment was posted on a site I administer, where I chronologically
publish an archive of some CentOS (and some other distros) lists:

===== [comment] =====
A new comment on the post "Is Glusterfs Ready?"
Author : Jeff Darcy (IP: 173.48.139.36 , pool-173-48-139-36.bstnma.fios.verizon.net)
E-mail : jeff@xxxxxxxxxx
URL    : http://pl.atyp.us
Whois  : http://whois.arin.net/rest/ip/173.48.139.36
Comment:
Hi. I'm one of the GlusterFS developers, and I'll try to offer a slightly
different perspective.

First, sure GlusterFS has bugs. Some of them even make me cringe. If we
really wanted to get into a discussion of the things about GlusterFS that
suck, I'd probably be able to come up with more things than anybody, but
one of the lessons I learned early in my career is that seeing all of the
bugs for a piece of software leads to a skewed perspective. Some people
have had problems with GlusterFS but some people have been very happy
with it, and I guarantee that every alternative has its own horror
stories. GlusterFS and XtreemFS were the only two distributed filesystems
that passed some *very simple* tests I ran last year. Ceph crashed.
MooseFS hung (and also doesn't honor O_SYNC). OrangeFS corrupted data.
HDFS cheats by buffering writes locally, and doesn't even try to
implement half of the required behaviors for a general-purpose
filesystem. I can go through any of those codebases and find awful bug
after horrible bug after data-destroying bug . . . and yet each of them
has its fans too, because most users could never possibly hit the edge
conditions where those bugs exist. The lesson is that anecdotes do not
equal data. Don't listen to vendor hype, and don't listen to anti-vendor
bashing either. Find out what the *typical* experience across a large
number of users is, and how well the software works in your own testing.

Second, just as every piece of software has bugs, every truly distributed
filesystem (i.e. not NFS) struggles with lots of small files. There has
been some progress in this area with projects like Pomegranate and GIGA+,
and we have some ideas for how to approach it in GlusterFS (see my talk
at SDC next week), but overall I think it's important to realize that
such a workload is likely to be problematic for *any* offering in the
category. You'll have to do a lot of tuning, maybe implement some special
workarounds yourself, but if you want to combine this I/O profile with
the benefits of scalable storage it can all be worth it.

Lastly, if anybody is paying a 4x disk-space penalty (at one site) I'd
say they're overdoing things. Once you have replication between servers,
RAID-1 on each server is overkill. I'd say even RAID-6 is overkill. How
many simultaneous disk failures do you need to survive? If the answer is
two, as it usually seems to be, then GlusterFS replication on top of
RAID-5 is a fine solution and requires a maximum of 3x (more typically
just a bit more than 2x). In the future we're looking at various forms of
compression and deduplication and erasure codes that will all bring the
multiple down even further.

So I can't say whether it's ready or whether you can trust it. I'm not
objective enough for my opinion on that to count for much.
What I'm saying is that distributed filesystems are complex pieces of
software, none of the alternatives are where any of us working on them
would like to be, and the only way any of these projects get better is if
users let us know of problems they encounter. Blog posts or comments
describing specific issues, from people whose names appear nowhere on any
email or bug report the developers could have seen, don't help to advance
the state of the art.
===== [/comment] =====

Regards,

-- 
J. Pavel Espinal
Skype: p.espinal
http://ww.pavelespinal.com
http://www.slackware-es.com
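
A note on the disk-space multiples quoted in the comment above (4x, "a
maximum of 3x", "a bit more than 2x"): they can be checked with a little
arithmetic. The sketch below is not from the thread. It assumes two-way
GlusterFS replication (replica count 2) and counts raw disk bytes bought
per usable byte stored. The function names and the 3-disk and 8-disk
array sizes are made up for illustration.

```python
from fractions import Fraction


def raid1_usable() -> Fraction:
    """Usable fraction of raw capacity for a two-disk RAID-1 mirror."""
    return Fraction(1, 2)


def raid5_usable(disks: int) -> Fraction:
    """Usable fraction for RAID-5, which gives up one disk to parity."""
    if disks < 3:
        raise ValueError("RAID-5 needs at least 3 disks")
    return Fraction(disks - 1, disks)


def raw_per_usable(replicas: int, local_usable: Fraction) -> Fraction:
    """Raw bytes needed per usable byte: replica count over local efficiency."""
    return replicas / local_usable


# Assumption: replica count 2 across servers in every case below.
print(raw_per_usable(2, raid1_usable()))               # 4  (replication + RAID-1)
print(raw_per_usable(2, raid5_usable(3)))              # 3  (smallest RAID-5 array)
print(f"{float(raw_per_usable(2, raid5_usable(8))):.2f}")  # 2.29 (8-disk RAID-5)
```

Under these assumptions the smallest possible RAID-5 array (3 disks)
gives the 3x worst case, and wider arrays approach 2x, which matches the
figures in the comment. Each RAID-5 set tolerates one disk failure, and
replication provides a second copy on another server.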