Hi,
After a couple months of almost no issues, our Ceph cluster has started to have frequent failures. Just this week it's failed about three times.
The issue appears to be that an MDS or Monitor fails and then all clients hang. After that, every client needs to be forcibly restarted.
Has anyone else run into this or have any suggestions on how to remedy it?
The architecture for our setup is:
- 3 each of MON and MDS instances (co-located), on servers with 2 CPUs and 4GB RAM
- 12 OSDs (SSD), on servers with 1 CPU and 1GB RAM
- Ceph v10.2.5
- Clients connect via the CephFS kernel driver
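If more detail would help, below is a rough sketch of a script I could run on one of the MON hosts the next time this happens, so I can post the output. It only calls read-only status commands (`ceph -s`, `ceph health detail`, `ceph mds stat`, `ceph osd tree`, `ceph df`). The function name, the command list, and the 30-second timeout are my own guesses at what is useful, not something taken from our actual setup.

```python
import shutil
import subprocess

# Read-only status commands; none of these change cluster state.
# Which commands are most useful here is an assumption on my part.
DIAGNOSTIC_COMMANDS = [
    ["ceph", "-s"],
    ["ceph", "health", "detail"],
    ["ceph", "mds", "stat"],
    ["ceph", "osd", "tree"],
    ["ceph", "df"],
]


def collect_diagnostics(commands=DIAGNOSTIC_COMMANDS, timeout=30):
    """Run each command and return {command string: output or error note}."""
    results = {}
    for cmd in commands:
        key = " ".join(cmd)
        # Skip cleanly if the binary is not installed on this host.
        if shutil.which(cmd[0]) is None:
            results[key] = "skipped: '%s' not found on PATH" % cmd[0]
            continue
        try:
            # The timeout matters: ceph commands can block when no monitor
            # is reachable, which is the situation being investigated.
            proc = subprocess.run(
                cmd,
                stdout=subprocess.PIPE,
                stderr=subprocess.STDOUT,
                universal_newlines=True,
                timeout=timeout,
            )
            results[key] = proc.stdout
        except subprocess.TimeoutExpired:
            results[key] = "timed out after %d seconds" % timeout
        except OSError as exc:
            results[key] = "failed to run: %s" % exc
    return results


if __name__ == "__main__":
    for command, output in collect_diagnostics().items():
        print("==== " + command + " ====")
        print(output)
```

I would pair this with the kernel log (dmesg) from one of the hung clients, since the clients use the kernel driver.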
I'd also like to note I'm relatively new to Ceph and I'm here on behalf of the person who set the cluster up, so any information is appreciated.
Thank you for your time,
Rich
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com