Re: MDS stuck in 'rejoin' after network fragmentation caused OSD flapping


On Sun, Aug 19, 2018 at 9:29 AM David Turner <drakonstein@xxxxxxxxx> wrote:
I second that you do not have nearly enough RAM in these servers, and I don't think you have at least 72 CPU cores either, which means you again fall short of the minimum recommendation for the number of OSDs you have, let alone everything else.  I would suggest you start by moving your MDS daemons off of these nodes, as they'll be the most hungry and problematic of the remaining services.  It would also probably make sense to just move the mon and mgr daemons to the new host as well.

On Sun, Aug 19, 2018, 8:01 AM Christian Wuerdig <christian.wuerdig@xxxxxxxxx> wrote:
It should be added though that you're running at only 1/3 of the
recommended RAM usage for the OSD setup alone - not to mention that
you also co-host MON, MGR and MDS daemons on there. The next time you
run into an issue - in particular with OSD recovery - you may be in a
pickle again and then it might not be so easy to get going.

Thanks. We have plenty of cores on these nodes. They were originally intended to expand an existing large gluster cluster, but due to serious issues we experienced doing that with another set of nodes, and given the huge performance benefits we saw when testing ceph under our workload, we decided to spin these up as a new ceph cluster. Running the service daemons co-located was necessary in this initial configuration. As we migrate data stored in gluster to this new cluster, we will be migrating those nodes to the ceph cluster. Most of those have more RAM than OSDs, so the plan is to farm out those service daemons to the nodes without as much OSD load.

Jonathan

--
Sent from my Commodore64
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
