Re: Advice for setup: SW RAID 6 vs JBOD

Your comment actually helps me more than you think: one of my main
doubts is whether to go for JBOD with replica 3 or SW RAID 6 with
replica 2 + arbiter. Before reading your email I was leaning more
towards JBOD, as reconstruction of a moderately big RAID 6 with mdadm
can be painful too. Now I see a reconstruction is going to be painful
either way...
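
As a sanity check on capacity, here is a quick back-of-the-envelope
comparison I did in Python (my own rough sketch, using the 12 * 3 TB
disks per node from my original mail and ignoring filesystem and RAID
overhead):

    # Rough usable-capacity comparison of the two candidate layouts.
    # Assumptions (mine): 3 data nodes for the JBOD option, 2 data nodes
    # plus a small arbiter node for the RAID 6 option.
    disks_per_node = 12
    disk_tb = 3

    # Option A: JBOD bricks, replica 3 across 3 nodes
    # -> every byte is stored 3 times, usable space = one node's disks
    jbod_usable = disks_per_node * disk_tb           # 36 TB

    # Option B: one SW RAID 6 per node (2 disks go to parity),
    # replica 2 + arbiter -> usable space = one RAID 6 array
    raid6_usable = (disks_per_node - 2) * disk_tb    # 30 TB

    print(f"JBOD + replica 3 usable:       {jbod_usable} TB")   # 36 TB
    print(f"RAID 6 + replica 2 + arbiter:  {raid6_usable} TB")  # 30 TB

Either way there is plenty of headroom over the ~20 TB I need to
migrate, so capacity alone does not decide it.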

For the record, the workload I am going to migrate is currently
18,314,445 MB and 34,752,784 inodes (not exactly the same as the file
count, but close enough for a rough estimate), which works out to an
average file size of about 539 KB.
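
(The quick arithmetic behind that figure, treating MB and KB as binary
units:)

    data_mb = 18_314_445
    n_files = 34_752_784   # inodes, used as a rough proxy for file count
    print(f"{data_mb * 1024 / n_files:.1f} KB per file")   # ~539.6 KB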

Thanks a lot for your time and insights!

On 6/6/19 8:53, Hu Bert wrote:
> Good morning,
>
> my comment won't help you directly, but I thought I'd send it anyway...
>
> Our first glusterfs setup had 3 servers with 4 disks = bricks (10 TB,
> JBOD) each. It was running fine in the beginning, but then 1 disk
> failed. The following heal took ~1 month, with bad performance (quite
> high IO). Shortly after the heal had finished, another disk failed ->
> same problems again. Not funny.
>
> For our new system we decided to use 3 servers with 10 disks (10 TB)
> each, but now with the 10 disks in a SW RAID 10 (well, we split the 10
> disks into 2 SW RAID 10 arrays, each of them is a brick, so we have 2
> gluster volumes). A lot of disk space is "wasted" with this type of SW
> RAID and a replica 3 setup, but we wanted to avoid the "healing takes
> a long time with bad performance" problem. Now mdadm takes care of
> replicating data, so glusterfs should always see "good" bricks.
>
> And the decision may depend on what kind of data you have. Many small
> files, like tens of millions? Or not that many, but bigger files? I
> once watched a video (I think it was this one:
> https://www.youtube.com/watch?v=61HDVwttNYI). The recommendation there
> was RAID 6 or 10 for small files; for big files... well, the video is
> already 2 years "old" ;-)
>
> As I said, this won't help you directly. You have to identify what's
> most important for your scenario; as you said, high performance is not
> an issue - if that still holds even with slight performance
> degradation after a disk failure, then ok. My experience so far: the
> bigger and slower the disks are and the more data you have -> healing
> will hurt -> try to avoid it. If the disks are small and fast (SSDs),
> healing will be faster -> JBOD is an option.
>
>
> hth,
> Hubert
>
> On Wed, 5 Jun 2019 at 11:33, Eduardo Mayoral <emayoral@xxxxxxxx> wrote:
>> Hi,
>>
>>     I am looking into a new gluster deployment to replace an ancient one.
>>
>>     For this deployment I will be using some repurposed servers I
>> already have in stock. The disk specs are 12 * 3 TB SATA disks. No HW
>> RAID controller. They also have some SSDs which would be nice to
>> leverage as cache or similar to improve performance, since they are
>> already there. Advice on how to leverage the SSDs would be greatly
>> appreciated.
>>
>>     One of the design choices I have to make is using 3 nodes for a
>> replica-3 with JBOD, or using 2 nodes with a replica-2 and SW RAID 6
>> for the disks, maybe adding a 3rd node with a smaller amount of disk
>> as a metadata node for the replica set. I would love to hear advice on
>> the pros and cons of each setup from the gluster experts.
>>
>>     The data will be accessed from 4 to 6 systems with native
>> gluster; not sure if that makes any difference.
>>
>>     The amount of data I have to store there is currently 20 TB,
>> with moderate growth. IOPS are quite low, so high performance is not
>> an issue. The data will fit in either of the two setups.
>>
>>     Thanks in advance for your advice!
>>
>> --
>> Eduardo Mayoral Jimeno
>> Systems engineer, platform department. Arsys Internet.
>> emayoral@xxxxxxxx - +34 941 620 105 - ext 2153
>>
>>

-- 
Eduardo Mayoral Jimeno
Systems engineer, platform department. Arsys Internet.
emayoral@xxxxxxxx - +34 941 620 105 - ext 2153

_______________________________________________
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
https://lists.gluster.org/mailman/listinfo/gluster-users


