David, SUCCESS!! Thank you so much!

I rebuilt the node because I could not install Jewel over the remnants of Kraken. So, while I did install Jewel, I am not convinced that was the solution. I did something that I had not tried under the Kraken attempts, and that is what solved the problem. For future_me, here was the solution.

Removed all references to r710e from the ceph.conf on the ceph-deploy node, in the original deployment folder /home/cephadminaccount/ceph-cluster/ceph.conf, then pushed it out:

ceph-deploy --overwrite-conf config push r710a r710b r710c

etc., to all nodes, including the ceph-deploy node, so it is now in /etc/ceph/ceph.conf everywhere.
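For anyone repeating this, the references in question are the usual mon membership lines in [global]. A rough sketch of what the edited file would look like — the fsid and IPs below are placeholders, not my real values:

[global]
fsid = <cluster fsid>
mon_initial_members = r710a, r710b, r710c
mon_host = 192.168.1.11,192.168.1.12,192.168.1.13
# r710e's hostname and IP deleted from the two lines above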
ceph-deploy install --release jewel r710e

ceph-deploy admin r710e

sudo chmod +r /etc/ceph/ceph.client.admin.keyring

Run on node r710e:

ceph-deploy mon create r710e

The node was created but still had the very same probing errors. Ugh.
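Also for future_me: the probing state can be watched on the new mon itself through its admin socket — a quick check, assuming the default socket setup and running it on r710e:

ceph daemon mon.r710e mon_status
# "state": "probing" means the mon still cannot find its peers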
Then I went to /home/cephadminaccount/ceph-cluster/ceph.conf, added r710e back in just the way it was before, and pushed it to all nodes:

ceph-deploy --overwrite-conf config push r710a r710b r710c

etc. Ran "sudo reboot" on r710g; don't know if this was necessary. When it came up, ceph -s was good. Rebooted r710e for good measure. Did not reboot r710f.
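Besides ceph -s, two quick ways to confirm the new mon really joined quorum (stock commands, nothing specific to my setup):

ceph mon stat
# should list r710e and show it in the quorum
ceph quorum_status
# JSON output; r710e should appear under "quorum_names"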
I am wondering: if I had just pushed the ceph.conf back out in the first place, would that alone have solved the problem? That is for another day.

-Jim

From: David Turner [mailto:drakonstein@xxxxxxxxx]

You can specify an option in ceph-deploy to tell it which release of ceph to install: jewel, kraken, hammer, etc. `ceph-deploy --release jewel` would pin the command to using jewel instead of kraken. While running a mixed environment is supported, it should always be tested before assuming it will work for you in production. The mons are quick enough to upgrade that I always do them together. Following that, I upgrade half of my OSDs in a test environment and leave it there for a couple of weeks (or until adequate testing is done) before upgrading the remaining OSDs and again waiting until the testing is done. I would probably do the MDS before the OSDs, but I don't usually think about that since I don't have them in a production environment. Lastly, I would test upgrading the clients (VM hosts, RGW, kernel clients, etc.) and test this state the most thoroughly. In production I haven't had to worry about an upgrade taking longer than a few hours with over 60 OSD nodes, 5 mons, and a dozen clients. I just don't see a need to run in a mixed environment in production, even if it is supported.
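Sketched as commands, that order would look something like this — the hostnames (mon1-mon3, osd1-osd4) are made up, and I'm assuming systemd-based Jewel packages with the stock targets:

# mons first, all together
ceph-deploy install --release jewel mon1 mon2 mon3
sudo systemctl restart ceph-mon.target      # on each mon, one at a time
# half the OSD nodes, then test for a while
ceph-deploy install --release jewel osd1 osd2
sudo systemctl restart ceph-osd.target      # on osd1, then osd2
# after testing: the remaining OSDs (osd3 osd4), then MDS, then clients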
Back to your problem with adding in the mon. Do your existing mons know about the third mon, or have you removed it from their running config? It might be worth double-checking their config files and restarting the daemons once you know they will pick up the correct settings.
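Something along these lines on each existing mon would show whether the file and the running config agree — assuming your mon IDs are the short hostnames, the way ceph-deploy sets them up:

grep -E 'mon_initial_members|mon_host' /etc/ceph/ceph.conf
ceph daemon mon.$(hostname -s) config show | grep -E 'mon_initial_members|mon_host'
# if they disagree, restart the mon so it rereads the file:
sudo systemctl restart ceph-mon@$(hostname -s)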
It's hard for me to help with this part, as I've been lucky enough not to have any problems with the online docs for this when it's come up. I've replaced 5 mons without any issues. I didn't use ceph-deploy, though, except to install the packages; I did the manual steps for it.
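Roughly, those manual steps are the documented add-a-monitor procedure — this is from memory, with <mon-id> and the temp paths as placeholders, so check the docs before running any of it:

# on the new mon node
sudo mkdir -p /var/lib/ceph/mon/ceph-<mon-id>
ceph auth get mon. -o /tmp/mon.keyring      # grab the mon keyring
ceph mon getmap -o /tmp/monmap              # grab the current monmap
sudo ceph-mon -i <mon-id> --mkfs --monmap /tmp/monmap --keyring /tmp/mon.keyring
sudo systemctl start ceph-mon@<mon-id>      # or: ceph-mon -i <mon-id> --public-addr <ip:port>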
Hopefully adding the mon back on Jewel fixes the issue. That would be the easiest outcome. I don't know that the Ceph team has tested adding upgraded mons to an old quorum.

On Wed, Jun 21, 2017 at 4:52 PM Jim Forde <jimf@xxxxxxxxx> wrote: