RAID options for Gluster

jdarcy at redhat.com (Jeff Darcy) · Fri, 15 Jun 2012 09:04:54 -0400

On 06/15/2012 06:14 AM, Fernando Frediani (Qube) wrote:
> Going back to the idea of using RAID controllers: for, say, 16 disks (or 12),
> would you think RAID 5 would be fine, given that the data is already
> replicated on another node for the very unlikely event that you lose a node?

If you're already using replication, then I'd say RAID 5 is fine.  Single-disk
failures will be handled at the RAID level; multiple failures within one array
(or the loss of a whole node) will be handled at the GlusterFS level.

> Now, in
> a node with more disk slots one could create multiple RAID 5 logical
> volumes, but will Gluster be smart enough not to put replicated data on two
> logical volumes residing on the same node?

GlusterFS basically doesn't make this decision; it just uses the bricks in the
order specified by the user to form first replica sets, then stripe sets on top
of those, then finally a single distribute set on top of those.  A while ago we
did add a check and warning for "bad" brick orders that would provide
inadequate fault protection, but we don't outright forbid them.
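
To make the ordering rule concrete, here is a minimal sketch of the grouping
described above.  It is not GlusterFS code; the function names and brick paths
are made up for illustration, and it only assumes that bricks are written as
host:/path and that consecutive bricks in the list form a replica set.

```python
def replica_sets(bricks, replica):
    """Group bricks into replica sets in the order given (illustrative only)."""
    if replica < 1 or len(bricks) % replica != 0:
        raise ValueError("brick count must be a multiple of the replica count")
    return [bricks[i:i + replica] for i in range(0, len(bricks), replica)]


def weak_sets(bricks, replica):
    """Return the replica sets in which two or more bricks share a host."""
    bad = []
    for group in replica_sets(bricks, replica):
        hosts = [brick.split(":", 1)[0] for brick in group]
        if len(set(hosts)) < len(hosts):
            bad.append(group)
    return bad


# Hypothetical layout: two nodes, two RAID 5 logical volumes each.
bad_order = ["node1:/raid5a", "node1:/raid5b", "node2:/raid5a", "node2:/raid5b"]
good_order = ["node1:/raid5a", "node2:/raid5a", "node1:/raid5b", "node2:/raid5b"]

print(weak_sets(bad_order, 2))   # both replica sets live on a single node
print(weak_sets(good_order, 2))  # no replica set shares a node
```

With the first ordering, each replica pair sits entirely on one node, so losing
that node loses both copies.  Interleaving the nodes, as in the second
ordering, avoids that.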

> But bottom line, the maximum
> performance you get for a single file is what the single RAID logical volume
> where the file resides can deliver.

In theory, you can get better performance by striping.  It works in HPC because
there the benefits from parallelism can exceed the overhead of splitting and
recombining requests, but I've never seen it work out that way for any non-HPC
workload.
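
As a rough illustration of the splitting overhead, the sketch below shows how
one read or write turns into several per-brick pieces under simple round-robin
striping.  This is generic striping arithmetic, not the actual GlusterFS
stripe translator; the function name and the stripe size are assumptions
chosen for the example.

```python
def split_request(offset, length, stripe_size, n_bricks):
    """Split one byte range into (brick_index, file_offset, length) pieces.

    Assumes plain round-robin striping: chunk k of the file lives on
    brick k % n_bricks.  Offsets returned are offsets within the file.
    """
    pieces = []
    end = offset + length
    while offset < end:
        chunk = offset // stripe_size
        chunk_end = (chunk + 1) * stripe_size
        n = min(end, chunk_end) - offset
        pieces.append((chunk % n_bricks, offset, n))
        offset += n
    return pieces


# A 1 MiB request over 128 KiB stripes on 4 bricks becomes 8 sub-requests.
pieces = split_request(0, 1024 * 1024, 128 * 1024, 4)
print(len(pieces))
```

Every one of those pieces has to be issued separately and the results
recombined, which is the overhead that parallelism has to beat.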