Re: Production cluster planning

On 10/26/2016 02:12 PM, Gandalf Corvotempesta wrote:
2016-10-26 23:07 GMT+02:00 Joe Julian <joe@xxxxxxxxxxxxxxxx>:
And yes, they can fail, but 20TB is small enough to heal pretty quickly.
20TB small enough to build quickly? On which network? Gluster doesn't
have a dedicated cluster network; if the cluster is being heavily
accessed, the healing will slow down everything else (or everything
else will slow down the healing).

Quickly = MTTR is within tolerances to continue to meet SLA. It's just math.
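
A rough sketch of that math, in case it helps: steady-state availability is MTBF / (MTBF + MTTR), and the MTTR for a failed brick is dominated by how long the heal takes. The heal throughput and MTBF below are illustrative assumptions, not measurements from any real cluster, and the function names are made up for this example.

```python
def heal_hours(capacity_tb: float, heal_mb_per_s: float) -> float:
    """Hours needed to re-replicate capacity_tb at a sustained heal rate."""
    total_mb = capacity_tb * 1_000_000  # decimal units: 1 TB = 1,000,000 MB
    return total_mb / heal_mb_per_s / 3600


def availability(mtbf_hours: float, mttr_hours: float) -> float:
    """Steady-state availability of a single component."""
    return mtbf_hours / (mtbf_hours + mttr_hours)


# Assumed inputs: a full 20 TB brick, 500 MB/s of heal throughput left over
# after client traffic, and a brick MTBF of 100,000 hours.
mttr = heal_hours(20, 500)
print(f"heal time: {mttr:.1f} h")
print(f"brick availability: {availability(100_000, mttr):.6f}")
```

Plug in your own heal rate under load; if the resulting availability still clears the SLA, the heal is "quick" in the sense that matters.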

As for a dedicated heal network, split-horizon DNS handles that just fine. Clients resolve a server's hostname to the "eth1" (for example) address and the servers themselves resolve the same hostname to the "eth0" address. We played with bonding but decided against the complexity.
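
A toy illustration of the split-horizon idea, not a DNS server: the same hostname maps to a different address depending on who is asking. The hostname and the documentation-range addresses are hypothetical; in practice this is typically done with DNS views or per-host /etc/hosts entries.

```python
# Hypothetical name and addresses, for illustration only.
CLIENT_VIEW = {"gluster1.example.com": "192.0.2.11"}     # servers' "eth1" side
SERVER_VIEW = {"gluster1.example.com": "198.51.100.11"}  # servers' "eth0" side


def resolve(hostname: str, asker_is_server: bool) -> str:
    """Return the address the asker should use for hostname."""
    view = SERVER_VIEW if asker_is_server else CLIENT_VIEW
    return view[hostname]


print(resolve("gluster1.example.com", asker_is_server=False))
print(resolve("gluster1.example.com", asker_is_server=True))
```

Because bricks are addressed by hostname, heal traffic between servers then follows the server-side addresses while client I/O stays on the other interface.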


Anyway, you can heal quickly, but I still prefer to have data safe on
each node. If you start with 3 servers at once, each disk is probably
coming from the same batch, thus a massive disk failure is easy to
get.

There's preference and there's engineering to meet requirements. If your SLA is 5 nines and you engineer 6 nines, you may realize that the difference between 99.99993% uptime and 99.99997% uptime isn't worth the added expense of doing both replication and RAID-1.
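
To put those percentages in concrete terms, here is a small sketch that converts an uptime percentage into allowed downtime per year. It assumes a 365-day year; the function name is just for this example.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600  # 31,536,000, ignoring leap years


def downtime_seconds_per_year(uptime_percent: float) -> float:
    """Seconds of downtime per year implied by an uptime percentage."""
    return (1 - uptime_percent / 100) * SECONDS_PER_YEAR


for pct in (99.999, 99.99993, 99.99997):
    print(f"{pct}% -> {downtime_seconds_per_year(pct):.1f} s/year")
```

The two six-nines-range figures differ by roughly a dozen seconds a year, and both sit far inside the five or so minutes that a five nines SLA allows.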

If you loose only 2 disks, one for each server, from the same replica
group, you are game over. With RAID6, you have to loose 5 disks from
the same replica group.

I never loose my drives. They're always firmly attached. :P

With 300 drives, 60 bricks, replica 3 (across 3 racks), I have six nines availability for any one replica subvolume. If you really want to fudge the numbers, the reliability for any given file is not worth calculating in that volume. The odds of all three bricks failing for any 1 file among 20 distribute subvolumes are statistically infinitesimal.
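
For anyone who wants to reproduce that kind of number, a minimal sketch: a replica subvolume is down only when all of its bricks are down at once. The per-brick availability of 0.99 is an assumed input (it happens to yield six nines at replica 3), and the model assumes brick failures are independent, which same-batch disks may not be.

```python
def replica_availability(brick_availability: float, replicas: int) -> float:
    """A replica set is unavailable only if every brick in it is down."""
    return 1 - (1 - brick_availability) ** replicas


def all_subvolumes_available(subvol_availability: float, subvolumes: int) -> float:
    """Probability that every distribute subvolume is up at the same time."""
    return subvol_availability ** subvolumes


subvol = replica_availability(0.99, 3)         # assumed per-brick availability
volume = all_subvolumes_available(subvol, 20)  # 60 bricks / replica 3
print(f"one replica subvolume: {subvol:.6f}")
print(f"all 20 subvolumes up:  {volume:.6f}")
```

Any single file lives on just one of the 20 subvolumes, so the first figure is the one that applies to an individual file.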


In my environment, I can create 4 RAID-0 on each server (3 disks on
each RAID0), or 2 RAID-6 with 6 disks each, or 1 RAID-6 with 12 disks
or 1 RAID-7 with 12 disks (RAID-7 with fewer than 12 disks is
nonsense).
I don't know which one is better.

Just do the reliability calculations and engineer a storage system to meet (exceed) your obligations within the available budget. http://www.eventhelix.com/realtimemantra/faulthandling/system_reliability_availability.htm
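
As a starting point for that calculation, here is a hedged sketch comparing three of the layouts above by the chance that a server loses at least one brick. It assumes each disk fails independently with the same probability p over the period of interest; p = 0.02 is an arbitrary illustrative value, and same-batch disks may well violate the independence assumption.

```python
from math import comb


def array_loss_probability(disks: int, tolerated: int, p: float) -> float:
    """P(more than `tolerated` of `disks` fail), each independently with prob p."""
    survive = sum(
        comb(disks, k) * p**k * (1 - p) ** (disks - k)
        for k in range(tolerated + 1)
    )
    return 1 - survive


def any_array_lost(per_array: float, arrays: int) -> float:
    """P(at least one of `arrays` identical arrays is lost)."""
    return 1 - (1 - per_array) ** arrays


p = 0.02  # assumed per-disk failure probability over the period of interest
layouts = {
    "4 x RAID-0 (3 disks)": any_array_lost(array_loss_probability(3, 0, p), 4),
    "2 x RAID-6 (6 disks)": any_array_lost(array_loss_probability(6, 2, p), 2),
    "1 x RAID-6 (12 disks)": array_loss_probability(12, 2, p),
}
for name, prob in layouts.items():
    print(f"{name}: {prob:.6f}")
```

A lost brick is not lost data when the volume is replicated; feed these per-brick figures into the replica math to see whether the extra parity disks actually buy anything your SLA needs.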

_______________________________________________
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
http://www.gluster.org/mailman/listinfo/gluster-users
