Best Regards,
Strahil Nikolov
On Tue, Mar 14, 2023 at 16:44, Diego Zuccato <diego.zuccato@xxxxxxxx> wrote:

Hello all.

Our Gluster 9.6 cluster is showing increasing problems.

Currently it's composed of 3 servers (2x Intel Xeon 4210 [20 cores dual thread, total 40 threads], 192GB RAM, 30x HGST HUH721212AL5200 [12TB]), configured in replica 3 arbiter 1. Using Debian packages from the Gluster 9.x latest repository.

It seems 192GB of RAM is not enough to handle 30 data bricks + 15 arbiters, and I often had to reload glusterfsd because glusterfs processes got killed for OOM.

On top of that, performance has been quite bad, especially once we reached about 20M files. In addition, one of the servers has had mobo issues that resulted in memory errors that corrupted some bricks' filesystems (XFS, it required "xfs_repair -L" to fix).

Now I'm getting lots of "stale file handle" errors and other errors (like directories that seem empty from the client but still contain files in some bricks), and auto healing seems unable to complete.

Since I can't keep up with manually fixing all the issues, I'm thinking about a backup+destroy+recreate strategy.

I think that if I reduce the number of bricks per server to just 5 (RAID1 of 6x12TB disks) I might resolve the RAM issues, at the cost of longer heal times in case a disk fails. Am I right, or is it useless? Other recommendations?

Servers have space for another 6 disks. Maybe those could be used for some SSDs to speed up access?

TIA.

--
Diego Zuccato
DIFA - Dip. di Fisica e Astronomia
Servizi Informatici
Alma Mater Studiorum - Università di Bologna
V.le Berti-Pichat 6/2 - 40127 Bologna - Italy
tel.: +39 051 20 95786
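For a rough sense of the memory pressure described above, the figures from the message (192GB RAM, 30 data bricks and 15 arbiters per server) can be turned into an even-split budget per brick process. This is only a back-of-the-envelope sketch: the function name `ram_per_brick_gb` is made up for illustration, and it assumes every brick process gets an equal share and that nothing else on the server uses RAM, which will not hold in practice.

```python
def ram_per_brick_gb(total_ram_gb, data_bricks, arbiter_bricks):
    """Even-split RAM budget per brick process, in GB.

    Assumption: one process per brick and an equal share for each.
    This is a rough upper bound, not a measurement of real usage.
    """
    processes = data_bricks + arbiter_bricks
    if processes <= 0:
        raise ValueError("need at least one brick")
    return total_ram_gb / processes


# Current layout from the message: 30 data bricks + 15 arbiters per server.
current = ram_per_brick_gb(192, 30, 15)
print(f"current layout: {current:.2f} GB per brick process")
```

For the proposed layout, pass 5 data bricks plus however many arbiter bricks the new volume would place on each server; that number is not stated in the message, so it is left out here.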
________

Community Meeting Calendar:

Schedule -
Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC
Bridge: https://meet.google.com/cpu-eiue-hvk
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
https://lists.gluster.org/mailman/listinfo/gluster-users