Hi all,

I have a 3-node Ceph cluster running on Ubuntu 14.04 on Dell R720xd servers. The Ceph version is 0.80.10.

Each node has 64 GB of RAM and 2 x E5-2695 v2 @ 2.40 GHz (so cat /proc/cpuinfo gives me 48 processors per node, each one at 1200 MHz with a cache size of 30720 kB). There are 3 mons (one on each node), 2 mds (active/backup) and 11 OSDs per node (no RAID, 3 TB 7200 rpm drives), plus 2 Intel 200 GB SSDs per node, with the journals running on the SSDs. The public/cluster network is 10 Gb LACP.

Here is my problem: yesterday I wanted to add a brand new node (R730xd) to my cluster:

    ceph-deploy install newnode
(ok)

    ceph-deploy osd create --zap-disk newnode:/dev/sdb:/dev/sdn
(and so on for my whole set of new disks, no problem here)

    ceph-deploy admin newnode

My cluster then became unstable, with flapping OSDs (going up and down), a high load average, many blocked requests, etc.

Here is a snapshot of the ceph -s output:
https://releases.cloud-omc.fr/releases/index.php/s/5sEugMTo6KJWpIX/download

I managed to get the cluster back to HEALTH_OK by removing every node one by one from the CRUSH map and finally removing the newly added host.

Did I miss something when adding the new node? Why did my cluster become so unusable? I can provide any log needed.

Thank you
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com