We never ran tests with Ceph, mostly due to time constraints in engineering. We also liked that, at least when I started as a novice, Gluster seemed easier to set up. We use the solution in automated setup scripts for maintaining very large clusters. Simplicity in automated setup is critical for us, including fully automated installation of supercomputers in QE and near-automation at customer sites.

We have been happy with our performance using Gluster and gluster NFS for root filesystems when using squashfs image files for the NFS roots instead of expanded file trees (on a sharded volume). For writable NFS, we use XFS filesystem images on gluster NFS instead of expanded trees (in this case, not on a sharded volume). We have systems running as large as 3072 nodes with 16 gluster servers (subvolumes of 3, distributed-replicate). We will have 5k nodes in production soon and will need to support 10k nodes in a year or so. So far we use CTDB for "HA-like" functionality, as Pacemaker is scary to us.

We have also designed a second solution around Gluster for high-availability head nodes (aka admin nodes). The old solution used two admin nodes, Pacemaker, and external shared storage to host a VM that would start on the second server if the first server died. As we know, two-node HA is not optimal. We designed a new three-server HA solution that eliminates the external shared storage (which was expensive) and instead uses Gluster, a sharded volume, and a qemu raw image hosted on that shared storage to back the virtual admin node. We use a four-disk RAID 10 per server for the gluster bricks here. We have been happy with the performance of this; it's only a little slower than the external shared-filesystem solution (we tended to use GFS2 or OCFS2 in the old setup). We did need to use Pacemaker for this one, as virtual machine availability isn't a good fit for CTDB (or is less natural, anyway). One highlight of this solution is that it allows a customer to put each of the three servers in a separate firewalled vault or room, keeping the head node alive even if a fire destroyed one server.

A key to our use of Gluster without suffering poor performance in our root-filesystem workloads is encapsulating filesystems in image files instead of using expanded trees of small files.

So far we have relied on gluster NFS for the boot servers, as Ganesha would crash. We haven't re-tried in several months, though, and we owe some debugging on that front; we have not had the resources to put into debugging Ganesha just yet.

I sure hope Gluster stays healthy and active. It is good to have multiple solutions with various strengths out there. I like choice. Plus, choice lets us learn from each other. I hope project sponsors see that too.

For anyone curious, rough sketches of both setups (with made-up host names and paths) are appended below the quoted messages.

Erik

> 17.06.2020 08:59, Artem Russakovskii wrote:
> > It may be stable, but it still suffers from performance issues, which
> > the team is working on. But nevertheless, I'm curious if maybe Ceph has
> > those problems sorted by now.
>
> Dunno, we run gluster on small clusters, kvm and gluster on the same hosts.
>
> There were plans to use ceph on a dedicated server next year, but budget was
> cut because you don't want to buy our oil for $120 ;-)
>
> Anyway, in our tests ceph is faster, this is why we wanted to use it, but
> not migrate from gluster.
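Sketch 1: the NFS-root idea. This is illustration only, a minimal Python wrapper around the gluster CLI showing the shape of the setup, not our actual scripts. The host names, brick paths, volume name, shard size and image paths are all invented for the example.

    #!/usr/bin/env python3
    """Illustration only: sharded replica-3 gluster volume serving a squashfs
    root image over gluster NFS. Host names, paths and sizes are made up."""
    import subprocess

    def run(cmd):
        # Echo and run a command, failing loudly the way a setup script should.
        print("+", " ".join(cmd))
        subprocess.run(cmd, check=True)

    # Hypothetical brick layout: 6 bricks, replica 3 -> 2 distribute subvolumes.
    bricks = [f"gl{i}.example.com:/data/brick/nfsroot" for i in range(1, 7)]

    # Create the distributed-replicate volume and turn on sharding so large
    # image files are split into fixed-size pieces across the subvolumes.
    run(["gluster", "volume", "create", "nfsroot", "replica", "3"] + bricks)
    run(["gluster", "volume", "set", "nfsroot", "features.shard", "on"])
    run(["gluster", "volume", "set", "nfsroot", "features.shard-block-size", "64MB"])
    # Keep serving with gluster NFS (Ganesha was not stable for us at the time).
    run(["gluster", "volume", "set", "nfsroot", "nfs.disable", "off"])
    run(["gluster", "volume", "start", "nfsroot"])

    # Pack the expanded root tree into one squashfs object written straight onto
    # a fuse mount of the volume; compute nodes loop-mount it read-only over NFS
    # instead of hitting gluster with an expanded tree of small files.
    run(["mksquashfs", "/srv/images/compute-root",
         "/mnt/nfsroot/compute-root.squashfs", "-noappend"])

The point is simply that the NFS clients see one big squashfs object per image rather than millions of small files, so gluster's small-file overhead mostly disappears.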
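Sketch 2: the three-server HA admin node. Again purely hypothetical: a replica-3 sharded volume across the head servers, a raw qemu image for the virtual admin node on a fuse mount of that volume, and a Pacemaker VirtualDomain resource to keep the VM running. The names, paths, sizes and the libvirt domain XML are placeholders, not our actual configuration.

    #!/usr/bin/env python3
    """Illustration only: three-server HA admin node on a sharded gluster
    volume, managed by Pacemaker. All names and sizes are placeholders."""
    import subprocess

    def run(cmd):
        print("+", " ".join(cmd))
        subprocess.run(cmd, check=True)

    # One brick per head server (each backed by a four-disk RAID 10), replica 3.
    bricks = [f"head{i}.example.com:/data/brick/adminvm" for i in range(1, 4)]
    run(["gluster", "volume", "create", "adminvm", "replica", "3"] + bricks)
    run(["gluster", "volume", "set", "adminvm", "features.shard", "on"])
    run(["gluster", "volume", "start", "adminvm"])

    # The virtual admin node lives in a single raw image on the volume
    # (assuming the volume is fuse-mounted at /adminvm on all three servers).
    run(["qemu-img", "create", "-f", "raw", "/adminvm/admin.img", "500G"])

    # Pacemaker keeps the VM running on one of the three servers;
    # /etc/libvirt/qemu/admin.xml is a placeholder libvirt domain definition.
    run(["pcs", "resource", "create", "admin_vm", "ocf:heartbeat:VirtualDomain",
         "hypervisor=qemu:///system", "config=/etc/libvirt/qemu/admin.xml",
         "meta", "allow-migrate=true"])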
________

Community Meeting Calendar:

Schedule -
Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC
Bridge: https://bluejeans.com/441850968

Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
https://lists.gluster.org/mailman/listinfo/gluster-users