Re: State of Gluster project

Diego Remolina <dijuremo@xxxxxxxxx> · Wed, 24 Jun 2020 12:38:30 -0400

Been using Gluster since the 3.3.x days, been burned a few times, and if it were not for the help of the community (one specific time saved big by Joe Julian), I would not have continued using it.
My main use was originally as a self-hosted engine via NFS and a file server for Windows clients with Samba. After a couple of years, I decided to get rid of virtualization, so it is now used exclusively as a file server. I never dared turn on sharding because of the horror stories I saw in the old days from people who turned it on, then back off, and then lost VMs to corruption.

The 3.3.x to 3.7.x days were good. I got burned somewhere in the post-3.7.x versions, mostly by problems with Samba VFS gluster and also with gluster mounts for the file server. Since I stopped using VFS gluster in Samba because specific files would break and no longer work from the Windows side, I had to switch permanently to gluster mounts. So from the post-3.7.x releases and throughout all the 4.x versions I lived with memory leaks related to the gluster mounts, and had to reboot servers once a month or even once every 15 days. Memory would start at around 11GB from the get-go and slowly climb until it crashed the server after eating up all 64GB of RAM in 15 to 30 days.
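To put rough numbers on the leak described above, here is a minimal Python sketch. It is not part of any Gluster tooling. The 11GB starting point, the 64GB of RAM and the 15 to 30 days come from this post; the function names, and the idea of tracking the mount process by reading the VmRSS field of /proc/<pid>/status on Linux, are assumptions made purely for illustration.

```python
def leak_rate_gb_per_day(start_gb, total_gb, days_to_crash):
    """Average daily growth needed to go from start_gb to total_gb."""
    if days_to_crash <= 0:
        raise ValueError("days_to_crash must be positive")
    return (total_gb - start_gb) / days_to_crash


def parse_vmrss_kb(status_text):
    """Return the VmRSS value (in kB) from the text of /proc/<pid>/status.

    Returns None if the field is missing (e.g. for kernel threads).
    """
    for line in status_text.splitlines():
        if line.startswith("VmRSS:"):
            # The line looks like "VmRSS:    4718592 kB"
            return int(line.split()[1])
    return None


if __name__ == "__main__":
    # Figures from the post: 11GB at start, 64GB total, crash in 15 to 30 days.
    for days in (15, 30):
        rate = leak_rate_gb_per_day(11, 64, days)
        print(f"{days} days to crash -> about {rate:.2f} GB/day")
```

Under these assumptions the mounts were growing by somewhere between roughly 1.8 and 3.5 GB per day. Sampling VmRSS for the mount process once a day and comparing it against a threshold would be one way to schedule a reboot before the server runs out of memory.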

After finally upgrading to 6.6.x I no longer have the memory leak. I have seen many improvements in speed, but I really did suffer with very slow performance for a long time. Nowadays, with 6.6.x and the same gluster mounts, memory usage stays at around 4-5GB total even after 80 days of uptime!

I can definitely echo the ease of use and setup that others have mentioned, but I really could not vouch for the stability until now, with 6.6.x. I have not dared go back to using VFS gluster due to the bad experience I had. I guess I could try again and probably even get some performance gains, but why try to fix what ain't broken at the moment and make my life harder?

I have tried other solutions and they are fast (significantly faster than gluster as a file server), reliable and stable. Unfortunately, they are not free (well, the version with automatic failover isn't free); otherwise, I would gladly give up glusterfs for MooseFS in my file server use case.

Diego

On Tue, Jun 23, 2020 at 4:13 AM Mahdi Adnan <maadnan@xxxxxxxxxxxx> wrote:
Hello Gionatan,
Using Gluster bricks on top of a RAID configuration might be safer and require less work from Gluster admins, but it is a waste of disk space.
Gluster bricks are replicated (assuming you're creating a distributed-replicated volume), so when a brick goes down it should be easy to recover it, and client I/O should not be affected.
We are using JBOD in all of our Gluster setups; overall, performance is good, and replacing a brick works "most" of the time without issues.
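The disk-space argument can be made concrete with a small back-of-the-envelope calculation. This is a minimal sketch under stated assumptions: every node has 12 identical disks (the count quoted further down the thread), each node is either a single RAID6 group, a single RAID10 group, or plain JBOD, every brick is replicated to `replica` nodes with the same layout, and hot spares, arbiter bricks and filesystem overhead are ignored. The function name `usable_fraction` is hypothetical.

```python
def usable_fraction(disks_per_node, layout, replica):
    """Fraction of raw disk capacity left for data after RAID and replication.

    layout: "raid6" (two parity disks), "raid10" (mirrored pairs) or "jbod".
    replica: Gluster replica count (number of full copies of each file).
    """
    data_disks = {
        "raid6": disks_per_node - 2,    # two disks' worth of parity
        "raid10": disks_per_node // 2,  # half the disks hold mirror copies
        "jbod": disks_per_node,         # no local redundancy
    }[layout]
    # Each set of `replica` nodes stores one copy's worth of unique data.
    return data_disks / (disks_per_node * replica)


if __name__ == "__main__":
    for layout in ("raid6", "raid10", "jbod"):
        frac = usable_fraction(12, layout, replica=3)
        print(f"{layout:>6} + replica 3: {frac:.1%} of raw capacity usable")
```

With these assumptions, replica 3 over JBOD keeps about a third of the raw capacity, RAID6 underneath brings that down to about 28%, and RAID10 to about 17%. What the calculation does not capture is the operational side discussed in this thread: how much work a failed disk causes in each layout.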

On Sun, Jun 21, 2020 at 8:43 PM Gionatan Danti <g.danti@xxxxxxxxxx> wrote:
On 2020-06-21 14:20, Strahil Nikolov wrote:

> With every community project, you are in the position of a beta
> tester, no matter whether it is Fedora, Gluster or Ceph. So far, I have had
> issues with upstream projects only during and immediately after
> patching, but this is properly mitigated with a reasonable
> patching strategy (patch the test environment and, several months later,
> patch prod with the same repos).
> Enterprise Linux breaks too (and a lot), having 10 times more users and
> use cases, so you cannot start using Gluster and assume
> that a free project won't break at all.
> Our part in this project is to help the devs create a test case for
> our workload, so that regressions are reduced to a minimum.

Well, this is true, and both devs & community deserve a big thanks for
all the work done.

> In the past 2 years, we got 2 major issues with VMware vSAN and 1
> major issue with an enterprise storage cluster (both solutions are
> quite expensive), so I always recommend proper testing of your
> software.

Interesting, I am almost tempted to ask you what issue you had with
vSAN, but this is not the right mailing list ;)

> From my observations, almost nobody is complaining about Ganesha on
> the mailing list: 50% are having issues with geo-replication, 20%
> are having issues with small-file performance, and the rest have
> issues with very old versions of gluster (v5 or older).

Mmm, I would swear I have read quite a few posts where the problem was
solved by migrating away from NFS Ganesha. Still, for hyperconverged
setups a problem remains: NFS on loopback/localhost is not 100% supported
(or, at least, RH is not willing to declare it supportable/production
ready [1]). A fuse mount would be the more natural way to access the
underlying data.

> I can't say that a replace-brick on a 'replica 3' volume is
> riskier than a rebuild of a RAID, but I have noticed that nobody is
> following Red Hat's guide to use either:
> - a RAID6 of 12 disks (2-3 TB each)
> - a RAID10 of 12 disks (2-3 TB each)
> - JBOD disks in 'replica 3' mode (I'm not sure about the size RH
> recommends, most probably 2-3 TB)
> So far, I haven't had the opportunity to run on JBODs.

For the RAID6/10 setup, I found no issues: simply replace the broken
disk without involving Gluster at all. However, this also means facing
the "iops wall" I described earlier for single-brick nodes. Going
full-Gluster with JBODs would be interesting from a performance
standpoint, but this complicates eventual recovery from bad disks.

Does someone use Gluster in JBOD mode? If so, can you share your
experience?

Thanks.

[1] https://access.redhat.com/solutions/22231 (account required)
[2] https://bugzilla.redhat.com/show_bug.cgi?id=489889 (old, but I cannot find anything newer)

-- 

Danti Gionatan

Supporto Tecnico

Assyoma S.r.l. - www.assyoma.it

email: g.danti@xxxxxxxxxx - info@xxxxxxxxxx

GPG public key ID: FF5F32A8

-- 

Mahdi Adnan
IT Manager 
Information Technology
EarthLink
VoIP: 69
Cell: 07903316180
Website: www.earthlink.iq

________

Community Meeting Calendar:

Schedule -

Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC

Bridge: https://bluejeans.com/441850968

Gluster-users mailing list

Gluster-users@xxxxxxxxxxx

https://lists.gluster.org/mailman/listinfo/gluster-users
