Re: Production cluster planning

Joe Julian <joe@xxxxxxxxxxxxxxxx> · Wed, 26 Oct 2016 14:38:49 -0700

    On 10/26/2016 02:12 PM, Gandalf Corvotempesta wrote:

      2016-10-26 23:07 GMT+02:00 Joe Julian <joe@xxxxxxxxxxxxxxxx>:

        And yes, they can fail, but 20TB is small enough to heal pretty quickly.

      20TB small enough to build quickly? On which network? Gluster doesn't
      have a dedicated cluster network. If the cluster is being heavily
      accessed, the healing will slow down everything else (or everything
      else will slow down the healing).

    Quickly = MTTR is within tolerances to continue to meet SLA. It's
    just math.
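
To make the "just math" point concrete, here is a minimal sketch. It assumes the usual steady-state formula, availability = MTBF / (MTBF + MTTR), and that MTTR is dominated by the time needed to heal one brick. The heal rate and MTBF figures are illustrative assumptions, not numbers from this thread.

```python
SECONDS_PER_HOUR = 3600.0


def heal_hours(brick_bytes, heal_bytes_per_s):
    """Hours needed to re-replicate one brick at a sustained heal rate."""
    return brick_bytes / heal_bytes_per_s / SECONDS_PER_HOUR


def availability(mtbf_hours, mttr_hours):
    """Steady-state availability of a single repairable component."""
    return mtbf_hours / (mtbf_hours + mttr_hours)


if __name__ == "__main__":
    # Assumed: 20 TB brick, 125 MB/s effective heal rate (about 1 Gbit/s).
    mttr = heal_hours(20e12, 125e6)
    # Assumed: 50,000 hours between brick failures (hypothetical figure).
    brick = availability(50_000, mttr)
    print(f"MTTR ~ {mttr:.1f} h, brick availability ~ {brick:.5f}")
```

If the resulting MTTR keeps the computed availability above what the SLA demands, the heal is "quick" in the sense used above.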

    As for a dedicated heal network, split-horizon DNS handles that just
    fine. Clients resolve a server's hostname to the "eth1" (for
    example) address and the servers themselves resolve the same
    hostname to the "eth0" address. We played with bonding but decided
    against the complexity.
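
The split-horizon idea can be modelled in a few lines. This is only a toy lookup table, not a DNS server configuration: the hostnames and addresses are hypothetical (documentation-only IP ranges), and the only thing taken from the description above is that clients get the "eth1" address while servers get the "eth0" address for the same name.

```python
# Hypothetical per-interface addresses for each server.
SERVERS = {
    "gluster1.example.com": {"eth0": "192.0.2.11", "eth1": "198.51.100.11"},
    "gluster2.example.com": {"eth0": "192.0.2.12", "eth1": "198.51.100.12"},
}

# Which interface each kind of resolver is told about.
VIEW_INTERFACE = {"server": "eth0", "client": "eth1"}


def resolve(hostname, view):
    """Return the address that a resolver in the given view would see."""
    return SERVERS[hostname][VIEW_INTERFACE[view]]


if __name__ == "__main__":
    name = "gluster1.example.com"
    print("client sees", resolve(name, "client"))
    print("server sees", resolve(name, "server"))
```

Because servers reach each other over one interface and clients come in over the other, heal traffic and client traffic end up on separate networks without changing any hostnames in the volume definition.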

Anyway, you can heal quickly, but I still prefer to have data safe on
each node. If you start with 3 servers at once, each disk probably
comes from the same batch, so a massive disk failure is easy to
get.

    There's preference and there's engineering to meet requirements. If
    your SLA is 5 nines and you engineer 6 nines, you may realize that
    the difference between a 99.99993% uptime and a 99.99997% uptime
    isn't worth the added expense of doing both replication and
    RAID-1.
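
As a rough illustration of how small that difference is, the sketch below converts an uptime percentage into expected downtime per year. It assumes a 365-day year; the function name is made up for this example.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600  # assumes a 365-day year


def downtime_seconds_per_year(uptime_percent):
    """Expected seconds of downtime per year at a given uptime percentage."""
    return (100.0 - uptime_percent) / 100.0 * SECONDS_PER_YEAR


if __name__ == "__main__":
    for pct in (99.999, 99.99993, 99.99997):
        down = downtime_seconds_per_year(pct)
        print(f"{pct}% uptime -> {down:.1f} s of downtime per year")
```

Under that assumption, five nines allows a bit over five minutes of downtime per year, and the two six-nines figures differ by roughly 12.6 seconds per year.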

      If you loose only 2 disks, one for each server, from the same replica
group, you are game over. With RAID6, you have to loose 5 disks from
the same replica group.

    I never loose my drives. They're always firmly attached. :P

    With 300 drives, 60 bricks, replica 3 (across 3 racks), I have a six
    nines availability for any one replica subvolume. If you really want
    to fudge the numbers, the reliability for any given file is not
    worth calculating in that volume. The odds of all three bricks
    failing for any 1 file among 20 distribute subvolumes is
    statistically infinitesimal.
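
A sketch of the kind of calculation involved, assuming brick failures are independent (which the same-batch concern raised earlier would weaken). A replica set is treated as components in parallel, and the distribute subvolumes as components in series. The per-brick availability is an assumed input, not a measured figure from this cluster.

```python
def replica_availability(brick_availability, replicas):
    """Parallel model: the subvolume is down only if every replica is down."""
    return 1.0 - (1.0 - brick_availability) ** replicas


def volume_availability(subvolume_availability, subvolumes):
    """Series model: every distribute subvolume must be up."""
    return subvolume_availability ** subvolumes


if __name__ == "__main__":
    bricks, replicas = 60, 3
    subvolumes = bricks // replicas  # 20 distribute subvolumes
    brick_a = 0.99                   # assumed per-brick availability
    sub_a = replica_availability(brick_a, replicas)
    vol_a = volume_availability(sub_a, subvolumes)
    print(f"replica subvolume: {sub_a:.6f}")
    print(f"whole volume ({subvolumes} subvolumes): {vol_a:.6f}")
```

With these assumed inputs, a two-nines brick in a replica 3 set already yields a six-nines subvolume, which is why per-brick RAID adds little in this model.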

In my environment, I can create 4 RAID-0 on each server (3 disks on
each RAID-0), or 2 RAID-6 with 6 disks each, or 1 RAID-6 with 12 disks,
or 1 RAID-7 with 12 disks (RAID-7 with fewer than 12 disks is
nonsense).
I don't know which one is better.

    Just do the reliability calculations and engineer a storage system
    to meet (exceed) your obligations within the available budget.
http://www.eventhelix.com/realtimemantra/faulthandling/system_reliability_availability.htm
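
As a starting point for that calculation, the sketch below tabulates the four 12-disk layouts mentioned above by usable capacity and by how many disk failures each array tolerates. It assumes "RAID-7" means triple parity, which is an interpretation and not something stated in this thread; the class and field names are invented for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Layout:
    name: str
    arrays: int           # arrays (bricks) per server
    disks_per_array: int
    parity_disks: int     # per array; also the failures one array tolerates

    @property
    def usable_disks(self):
        """Disks' worth of usable capacity per server."""
        return self.arrays * (self.disks_per_array - self.parity_disks)


LAYOUTS = [
    Layout("4 x RAID-0 (3 disks each)", 4, 3, 0),
    Layout("2 x RAID-6 (6 disks each)", 2, 6, 2),
    Layout("1 x RAID-6 (12 disks)", 1, 12, 2),
    Layout("1 x triple parity (12 disks)", 1, 12, 3),  # assumed meaning of "RAID-7"
]

if __name__ == "__main__":
    for layout in LAYOUTS:
        print(f"{layout.name}: {layout.usable_disks} usable disks, "
              f"tolerates {layout.parity_disks} failure(s) per array")
```

Capacity and fault tolerance per array are only two of the inputs; the rebuild time for each layout feeds into MTTR and therefore into the availability figure that has to meet the SLA.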
