The recommended approach is to use 4TB disks, with no more than 10-12 disks per HW RAID set.
Of course, that's not always possible, but a resync of a failed 14TB drive will take eons.
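
To put a rough number on the resync time, here is a minimal sketch of the arithmetic. The rebuild rates (150 MB/s on an idle array, 50 MB/s under production load) are assumptions for illustration, not measurements from your hardware, and the function name is just a placeholder:

```python
def rebuild_hours(capacity_tb: float, rate_mb_s: float) -> float:
    """Lower bound on rebuild time: the whole disk has to be rewritten once."""
    capacity_bytes = capacity_tb * 1e12  # vendor (decimal) terabytes
    rate_bytes_s = rate_mb_s * 1e6       # sustained rebuild rate in bytes/s
    return capacity_bytes / rate_bytes_s / 3600


# Assumed rates, not measurements: 150 MB/s idle, 50 MB/s under load.
for size_tb in (4, 6, 14):
    idle = rebuild_hours(size_tb, 150)
    busy = rebuild_hours(size_tb, 50)
    print(f"{size_tb:>2} TB: {idle:5.1f} h idle, {busy:5.1f} h under load")
```

Under these assumed rates a 14TB drive needs about a day on an idle array and more than three days under load, and the array runs degraded for that whole window.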
I'm not sure whether the Ryzens support ECC memory, but if they do, go for it.
In both scenarios, always align the upper layers (LVM, FS) with the stripe width and stripe size.
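
As a rough sketch of what that alignment means in numbers: the full stripe is the per-disk chunk (stripe unit) multiplied by the number of data disks, and that is the value to pass to LVM and the filesystem. The example below assumes a 5-disk RAID5 with a 256 KiB chunk; the chunk size, the helper name and the device paths (/dev/sdX, /dev/VG/LV) are placeholders, so check what your controller actually uses:

```python
def raid_alignment(disks: int, parity_disks: int, chunk_kib: int) -> dict:
    """Derive alignment values for a striped parity RAID set."""
    data_disks = disks - parity_disks
    if data_disks < 1:
        raise ValueError("need at least one data disk")
    full_stripe_kib = chunk_kib * data_disks  # one full stripe across data disks
    return {
        "data_disks": data_disks,
        "full_stripe_kib": full_stripe_kib,
        # Placeholder device names; substitute the real PV and LV.
        "pvcreate": f"pvcreate --dataalignment {full_stripe_kib}k /dev/sdX",
        "mkfs_xfs": f"mkfs.xfs -d su={chunk_kib}k,sw={data_disks} /dev/VG/LV",
    }


# Assumed layout: 5-disk RAID5 (1 parity disk) with a 256 KiB chunk.
layout = raid_alignment(disks=5, parity_disks=1, chunk_kib=256)
for key, value in layout.items():
    print(f"{key}: {value}")
```

For this assumed layout the full stripe is 1024 KiB, so the physical volume is aligned to 1024k and XFS is created with su=256k,sw=4.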
What kind of workload do you have?
Best Regards,
Strahil Nikolov
On Sat, Mar 18, 2023 at 14:36, Martin Bähr <mbaehr+gluster@xxxxxxxxxx> wrote:

hi,

our current servers are suffering from a weird hardware issue that
forces us to start over.

in short we have two servers with 15 disks at 6TB each, divided into
three raid5 arrays for three bricks per server at 22TB per brick.
each brick on one server is replicated to a brick on the second server.

the hardware issue is that somewhere in the backplane random I/O errors
happen when the system is under load. these cause the raid to fail
disks, although the disks themselves are perfectly fine. reintegration
of the disks causes more load and is therefore difficult.

we have been running these servers for at least four years, and the problem
only started appearing about three months ago.

our hosting provider acknowledged the issue but does not support moving
the disks to different servers. (they replaced the hardware but that
didn't help)

so we need to start over.

my first intuition was that we should have smaller servers with fewer
disks to avoid repeating the above scenario.

we also previously had issues with the load created by raid resync so we
are considering skipping raid altogether and relying on gluster replication
instead. (by compensating with three replicas per brick instead of two)

our options are:

6 of these:

AMD Ryzen 5 Pro 3600 - 6c/12t - 3.6GHz/4.2GHz
32GB - 128GB RAM
4 or 6 × 6TB HDD SATA 6Gbit/s

or three of these:

AMD Ryzen 7 Pro 3700 - 8c/16t - 3.6GHz/4.4GHz
32GB - 128GB RAM
6 × 14TB HDD SAS 6Gbit/s

i would configure 5 bricks on each server (leaving one disk as a hot spare)

the engineers prefer the second option due to the architecture and SAS
disks. it is also cheaper.

i am concerned that 14TB disks will take too long to heal if one ever has
to be replaced and would favor the smaller disks.

the other question is, is skipping raid a good idea?

greetings, martin.
________

Community Meeting Calendar:

Schedule -
Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC
Bridge: https://meet.google.com/cpu-eiue-hvk
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
https://lists.gluster.org/mailman/listinfo/gluster-users