> > The stuff I work on doesn't use containers much (unlike a different
> > system also at HPE).

> By "pods" I meant "glusterd instance", a server hosting a collection of
> bricks.

Oh, OK. The term is overloaded in my world.

> > I don't have a recipe, they've just always been beefy enough for
> > gluster. Sorry I don't have a more scientific answer.

> Seems that 64GB RAM are not enough for a pod with 26 glusterfsd
> instances and no other services (except sshd for management). What do
> you mean by "beefy enough"? 128GB RAM or 1TB?

We are currently using replica-3 but may also support replica-5 in the future. So if you had 24 leaders like HLRS, there would be 8 replica-3 subvolumes at the bottom layer, with files distributed across them (replicated/distributed volumes).

So we would have 24 leader nodes, and each leader would have a disk serving 4 bricks: one is simply a lock FS for CTDB, one is sharded, one is for logs, and one is heavily optimized for non-object expanded-tree NFS. The term "disk" is loose here.

Each SU Leader (or gluster server) serves the 4 volumes in the 8x3 configuration. In our world the leaders have some differences in CPU type, memory, and storage depending on order, preferences, and timing (things always move forward).

On an SU Leader, we typically do 2 RAID10 volumes with a RAID controller including cache. However, we have moved to RAID1 in some cases with better disks. Leaders store a lot of non-gluster stuff on "root", and gluster has a dedicated disk/LUN.

We have been trying to improve our helper tools to 100% wheel out a bad leader (say it melted into the floor) and replace it. Once we have that solid, and because our monitoring data on the "root" drive is already redundant, we plan to move newer servers to two NVMe drives without RAID: one for gluster and one for the OS.
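As a concrete illustration of the 8x3 layout above, here is a hedged sketch of how one of the four volumes could be created as a distributed-replicated volume across 24 leaders with replica 3 (24 / 3 = 8 replica sets). The hostnames and brick paths are invented for illustration; the command is printed rather than executed, since it needs a running gluster cluster.

```shell
# Build the brick list for 24 hypothetical leaders (names/paths are made up).
bricks=""
for i in $(seq 1 24); do
  bricks="$bricks leader$i:/bricks/vol1/brick"
done

# With replica 3, gluster groups consecutive bricks into 8 replica sets
# and distributes files across those sets. Print the command we would
# run on one of the leaders:
echo gluster volume create vol1 replica 3 $bricks
```

Each group of three consecutive bricks in the list becomes one replica-3 set, which is why brick ordering matters when you want replicas on distinct leaders.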
If a leader melts into the floor, we have a procedure to discover a new node for it, install the base OS including gluster/CTDB/etc., and then run a tool to re-integrate it into the cluster as an SU Leader node again and do the healing. Separately, monitoring data outside of gluster will heal on its own.

PS: I will note that I have a mini SU Leader cluster on my desktop (qemu/libvirt) for development. It is a 1x3 set of SU Leaders, one head node, and one compute node. I make an adjustment to reduce the gluster cache to fit in the available memory. Works fine. Not real fast, but good enough for development.

Specs of a leader node at a customer site:

* 256G RAM
* Storage:
  - MR9361-8i controller
  - 7681GB root LUN (RAID1)
  - 15.4 TB for gluster bricks (RAID10)
  - 6x SATA SSD MZ7LH7T6HMLA-00005
* AMD EPYC 7702 64-Core Processor
  - CPU(s): 128
  - On-line CPU(s) list: 0-127
  - Thread(s) per core: 2
  - Core(s) per socket: 64
  - Socket(s): 1
  - NUMA node(s): 4
* Management Ethernet
  - Gluster and cluster management co-mingled
  - 2x40G (but 2x10G would be fine)

________

Community Meeting Calendar:

Schedule -
Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC
Bridge: https://meet.google.com/cpu-eiue-hvk

Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
https://lists.gluster.org/mailman/listinfo/gluster-users