Re: New GlusterFS deployment, doubts on 1 brick per host vs 1 brick per drive.

Hi,
thanks to you both for the replies.

On Thu, 10 Sep 2020 at 16:08, Darrell Budic <budic@xxxxxxxxxxxxxxxx> wrote:
I run ZFS on my servers (with additional RAM for that reason) in my replica-3 production cluster. I chose the sizing and ZFS striping of the HDDs, with easy compression and ZFS-controlled caching on SSDs, for my workload (mainly VMs). It performs as expected, but I don’t have the resources to test it head-to-head against a bunch of individual bricks. One drawback to large bricks is that they can take longer to heal, in my experience. I also run some smaller volumes on SSDs for VMs with databases and other high-IOPS workloads, and for those I use tuned XFS volumes because I didn’t want compression and did want faster healing.

With the options you’ve presented, I’d use XFS on single bricks (there’s not much need for the overhead unless you REALLY want ZFS compression), and ZFS if you wanted higher-performing volumes, mirrors, or had some cache to take advantage of. Or if you knew your workload could take advantage of the things ZFS is good at, like setting specific record sizes tuned to your workload on sub-volumes. But that depends on how you’re planning to consume your storage, as file shares or as disk images. The ultimate way to find out, of course, is to test each configuration and see which gives you the most of what you want :)

Yes, ZFS (or btrfs) was for compression but also for the added robustness provided by checksums. I didn’t mention btrfs, but I’m comfortable with btrfs for simple volumes with compression; I imagine, though, that there isn’t a large user base of GlusterFS + btrfs.
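
For the ZFS option, what I have in mind per brick is roughly the following (just a sketch, not a recipe: the pool/dataset names are placeholders, and ZFS checksumming is on by default, so only compression and a few Gluster-friendly properties need setting):

    # one dataset per brick; "tank" and "brick1" are placeholder names
    zfs create tank/brick1
    zfs set compression=lz4 tank/brick1    # transparent compression
    zfs set xattr=sa tank/brick1           # Gluster relies heavily on xattrs
    zfs set atime=off tank/brick1          # skip access-time updates on cold data
    zfs set recordsize=1M tank/brick1      # large records suit big training files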

This is a mostly cold dataset with lots of uncompressed training data for ML.

There is one argument for big, fat, internally redundant (ZFS) bricks: there is more widespread knowledge on how to manage failed drives with ZFS. One of the inputs I was seeking, given my inexperience with GlusterFS, is this management side. I didn’t see in the docs how to add spare drives, what happens when a brick dies, what type of healing exists, or what to do if, for example, there isn’t a replacement drive.
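
To make the question concrete, my rough understanding (a sketch only, untested; "gv0", the host names and the brick paths are placeholders) is that replacing a dead brick in a replica volume goes something like:

    # new disk mounted at a fresh path on the same host
    gluster volume replace-brick gv0 host1:/bricks/b3/gv0 host1:/bricks/b3-new/gv0 commit force

    # self-heal then repopulates the new brick from the surviving replica
    gluster volume heal gv0 info

What I couldn’t find is whether Gluster has any notion of a hot spare, or what the behaviour is while no replacement disk is available.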



And definitely get a 3rd server in there with at least enough storage to be an arbiter. At the level you’re talking, I’d try and deck it out properly and have 3 active hosts off the bat so you can have a proper redundancy scheme. Split brain more than sucks.

Agreed, I’m aware of split brain. I will add additional nodes ASAP; it is already planned.
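
If I read the docs right, once the third host is in place, converting the existing replica-2 volume to replica 3 with an arbiter should be something along these lines (again a sketch; host and path names are placeholders):

    gluster peer probe host3
    gluster volume add-brick gv0 replica 3 arbiter 1 host3:/bricks/arbiter/gv0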


 -Darrell

> On Sep 10, 2020, at 1:33 AM, Diego Zuccato <diego.zuccato@xxxxxxxx> wrote:
>
> On 09/09/20 15:30, Miguel Mascarenhas Filipe wrote:
>
> I'm a noob, but IIUC this is the option giving the best performance:
>
>> 2. 1 brick per drive, Gluster "distributed replicated" volumes, no
>> internal redundancy
>
> Clients can write to both servers in parallel and read scattered (read
> performance using multiple files ~ 16x vs 2x with a single disk per
> host). Moreover it's easier to extend.
> But why ZFS instead of XFS? In my experience it's heavier.
>
> PS: add a third host ASAP, at least for arbiter volumes (replica 3
> arbiter 1). Split brain can be a real pain to fix!
>
> --
> Diego Zuccato
> DIFA - Dip. di Fisica e Astronomia
> Servizi Informatici
> Alma Mater Studiorum - Università di Bologna
> V.le Berti-Pichat 6/2 - 40127 Bologna - Italy
> tel.: +39 051 20 95786

--
Miguel Mascarenhas Filipe
________



Community Meeting Calendar:

Schedule -
Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC
Bridge: https://bluejeans.com/441850968

Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
https://lists.gluster.org/mailman/listinfo/gluster-users
