Re: Need help to design a data storage

Hi,

On 09/08/16 20:43, Gandalf Corvotempesta wrote:
On 09 Aug 2016 19:57, "Ashish Pandey" <aspandey@xxxxxxxxxx> wrote:
Yes, redundant data spread across multiple servers. In my example I
mentioned 6 different nodes, each with one brick.
The point is that with 4+2 you can lose any 2 bricks. It could be because
of a node failure or a brick failure.
6 bricks on 6 different nodes - any 2 nodes may go down - EC wins.

However, if you have only 2 nodes with 3 bricks each, then yes,
in this case even if one node goes down EC will fail, because that will
take 3 bricks down.
In this case replica 3 would win.

6 nodes with 1 brick each is an unrealistic case.
A much more common case is multiple nodes with multiple bricks, something
like 9 nodes with 12 bricks each (for example, a 2U Supermicro server
with 12 disks).

In this case, EC replicas could be placed on a single server.

Not really. The disperse sets, like the replica sets, are defined when the volume is created. You must make sure that every disperse set is made of bricks from different servers. If this condition is satisfied while creating the volume, there won't be two fragments of the same file on two bricks of the same server.
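
To make this concrete: gluster groups the bricks into disperse sets in the order they are listed when the volume is created, so it is enough to order the brick list so that no set contains two bricks from the same server. A minimal Python sketch, assuming the 9 servers x 12 bricks layout discussed above (hostnames and brick paths are hypothetical):

servers = [f"server{i}" for i in range(1, 10)]   # 9 servers (made-up names)
bricks_per_server = 12
disperse_size = 6                                # 4 data + 2 redundancy

# Walk across the servers first, so any 6 consecutive bricks in the list
# come from 6 distinct servers (possible because 6 <= 9).
ordered = []
for b in range(1, bricks_per_server + 1):
    for s in servers:
        ordered.append(f"{s}:/bricks/brick{b}")

# Sanity check: every disperse set spans 6 different servers.
for i in range(0, len(ordered), disperse_size):
    group = ordered[i:i + disperse_size]
    assert len({brick.split(":")[0] for brick in group}) == disperse_size

print(ordered[:disperse_size])   # the first disperse set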


And with 9*12 bricks you still have 2 single disks (or one server if
both are placed on the same hardware) as failure domains.
Yes, you'll get 9*(12-2) usable bricks and not (9*12)/3 but you risk
data loss for sure.

It's true that the probability of failure of a distributed-replicated volume is smaller than that of a distributed-dispersed one. However, if you are considering big volumes with redundancy 2 or higher, replication gets prohibitively expensive and wastes a lot of bandwidth.

You can reduce the probability of a local disk failure by creating bricks on top of RAID5 or RAID6 arrays if you want. It will waste more disks, but far fewer than replication.
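
As a rough back-of-the-envelope comparison in Python (a sketch only; the RAID layout below, two 6-disk RAID6 arrays per server, is just an assumption for illustration):

disks = 9 * 12                  # 108 disks in total, as in the example above

replica_3    = disks / 3        # keep 1 copy out of 3      -> 36 disks usable
disperse_4_2 = disks * 4 / 6    # 4 data + 2 redundancy     -> 72 disks usable

# Bricks on RAID6: two 6-disk RAID6 arrays (4 data + 2 parity) per server,
# with a 4+2 disperse volume on top of those bricks.
disperse_on_raid6 = disks * (4 / 6) * (4 / 6)               # -> 48 disks usable

print(replica_3, disperse_4_2, disperse_on_raid6)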


Just a question: with EC, which is the right way to calculate usable capacity among these three:

a)  (#servers*#bricks)-#replicas

Or

b) #servers*(#bricks - #replicas)

Or

c) (#servers-#replicas)*#bricks

In case A I'll use 2 disks for redundancy across the whole volume (exactly
like a RAID6).

In case B I'll use 2 disks from each server for redundancy.

In case C I'll use 2 whole servers for redundancy (this is the most secure,
as I can lose 2 whole servers).

In fact none of these is completely correct. The redundancy level is per disperse set, not for the whole volume.

S: number of servers
D: number of disks per server
N: Disperse set size
R: Disperse redundancy

Usable disks = S * D * (1 - R / N)
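
Plugging the numbers from the 9 x 12 example into that formula, assuming a 4+2 configuration (a quick Python check):

S, D = 9, 12          # servers, bricks per server
N, R = 6, 2           # disperse set size (4+2), redundancy per set

print(S * D * (1 - R / N))    # 72.0 usable bricks out of 108

# For comparison, the three guesses above all overestimate the usable capacity:
print(S * D - R)              # a) 106
print(S * (D - R))            # b) 90
print((S - R) * D)            # c) 84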





_______________________________________________
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
http://www.gluster.org/mailman/listinfo/gluster-users


