Indeed that was the problem!
In case anyone else ever runs into the same situation, please keep in
mind that no matter what you pass to the "ceph-deploy" command, at some
point it will use the output of "hostname -s" and try to connect to
that monitor to gather data.
If you have changed the hostname, as I had, the existing files will not
match the name from that output.
Thus I had to temporarily go back to the old hostname in order to
gather the keys, deploy them, and finally create the MGR. After that
the cluster came back to a healthy state.
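For the record, the workaround looked roughly like this (a sketch of
what I did; "nefelus-controller" was the original hostname, "ovhctrl"
the new one, and "controller" the ceph-deploy target as before):

# hostnamectl set-hostname nefelus-controller   # temporarily revert
$ ceph-deploy gatherkeys controller
$ ceph-deploy mgr create controller
# hostnamectl set-hostname ovhctrl              # restore the new hostname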
Thank you all for your time.
Regards,
G.
Could it be a problem that I changed the hostname after the mon was
created? What I mean is this:
# hostname -s
ovhctrl
# ceph daemon mon.$(hostname -s) quorum_status
admin_socket: exception getting command descriptions: [Errno 2] No such file or directory
But if I run it as "nefelus-controller", which is the name the monitor
was initially created with:
# ceph daemon mon.nefelus-controller quorum_status
{
    "election_epoch": 69,
    "quorum": [
        0
    ],
    "quorum_names": [
        "nefelus-controller"
    ],
    "quorum_leader_name": "nefelus-controller",
    "monmap": {
        "epoch": 2,
        "fsid": "d357a551-5b7a-4501-8d8f-009c63b2c972",
        "modified": "2018-02-28 18:49:55.985382",
        "created": "2017-03-23 22:36:56.897038",
        "features": {
            "persistent": [
                "kraken",
                "luminous"
            ],
            "optional": []
        },
        "mons": [
            {
                "rank": 0,
                "name": "nefelus-controller",
                "addr": "xxx.xxx.xxx.xxx:6789/0",
                "public_addr": "xxx.xxx.xxx.xxx:6789/0"
            }
        ]
    }
}
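A quick way to confirm which name the daemon is actually registered
under (assuming the default admin socket path) is to list the sockets
on the monitor node:

# ls /var/run/ceph/
ceph-mon.nefelus-controller.asok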
Additionally "ceph auth list" has in every entry the [mgr] caps
G.
Hi,
it looks like you didn't run the ceph-deploy command as the same user,
and maybe not from the same current working directory. This could
explain your problem.
Also make sure the other daemons have a mgr cap authorisation. You can
find details on this ML about MGR caps being incorrect for OSDs and
MONs after a Jewel to Luminous upgrade. The output of a "ceph auth
list" command should help you find out whether that is the case.
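If the caps do turn out to be missing, something along these lines
should add them back (a sketch based on the Luminous upgrade notes;
adjust the entity IDs to your cluster):

# ceph auth caps osd.0 mon 'allow profile osd' mgr 'allow profile osd' osd 'allow *'
# ceph auth caps client.admin mon 'allow *' mgr 'allow *' osd 'allow *' mds 'allow *'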
Are your ceph daemons still running? What does a "ceph daemon
mon.$(hostname -s) quorum_status" give you on a MON server?
JC
On Feb 28, 2018, at 10:05, Georgios Dimitrakakis
<giorgis@xxxxxxxxxxxx> wrote:
Indeed John,
you are right! I have updated "ceph-deploy" (which was installed via
"pip", which is why it wasn't updated with the rest of the ceph
packages), but now it complains that keys are missing:
$ ceph-deploy mgr create controller
[ceph_deploy.conf][DEBUG ] found configuration file at: /home/user/.cephdeploy.conf
[ceph_deploy.cli][INFO ] Invoked (2.0.0): /usr/bin/ceph-deploy mgr create controller
[ceph_deploy.cli][INFO ] ceph-deploy options:
[ceph_deploy.cli][INFO ]  username        : None
[ceph_deploy.cli][INFO ]  verbose         : False
[ceph_deploy.cli][INFO ]  mgr             : [('controller', 'controller')]
[ceph_deploy.cli][INFO ]  overwrite_conf  : False
[ceph_deploy.cli][INFO ]  subcommand      : create
[ceph_deploy.cli][INFO ]  quiet           : False
[ceph_deploy.cli][INFO ]  cd_conf         : <ceph_deploy.conf.cephdeploy.Conf instance at 0x1d42bd8>
[ceph_deploy.cli][INFO ]  cluster         : ceph
[ceph_deploy.cli][INFO ]  func            : <function mgr at 0x1cce500>
[ceph_deploy.cli][INFO ]  ceph_conf       : None
[ceph_deploy.cli][INFO ]  default_release : False
[ceph_deploy.mgr][DEBUG ] Deploying mgr, cluster ceph hosts controller:controller
[ceph_deploy][ERROR ] RuntimeError: bootstrap-mgr keyring not found; run 'gatherkeys'
and I cannot get the keys...
$ ceph-deploy gatherkeys controller
[ceph_deploy.conf][DEBUG ] found configuration file at: /home/user/.cephdeploy.conf
[ceph_deploy.cli][INFO ] Invoked (2.0.0): /usr/bin/ceph-deploy gatherkeys controller
[ceph_deploy.cli][INFO ] ceph-deploy options:
[ceph_deploy.cli][INFO ]  username        : None
[ceph_deploy.cli][INFO ]  verbose         : False
[ceph_deploy.cli][INFO ]  overwrite_conf  : False
[ceph_deploy.cli][INFO ]  quiet           : False
[ceph_deploy.cli][INFO ]  cd_conf         : <ceph_deploy.conf.cephdeploy.Conf instance at 0x199f290>
[ceph_deploy.cli][INFO ]  cluster         : ceph
[ceph_deploy.cli][INFO ]  mon             : ['controller']
[ceph_deploy.cli][INFO ]  func            : <function gatherkeys at 0x198b2a8>
[ceph_deploy.cli][INFO ]  ceph_conf       : None
[ceph_deploy.cli][INFO ]  default_release : False
[ceph_deploy.gatherkeys][INFO ] Storing keys in temp directory /tmp/tmpPQ895t
[controller][DEBUG ] connection detected need for sudo
[controller][DEBUG ] connected to host: controller
[controller][DEBUG ] detect platform information from remote host
[controller][DEBUG ] detect machine type
[controller][DEBUG ] get remote short hostname
[controller][DEBUG ] fetch remote file
[ceph_deploy.gatherkeys][WARNIN] No mon key found in host: controller
[ceph_deploy.gatherkeys][ERROR ] Failed to connect to host:controller
[ceph_deploy.gatherkeys][INFO ] Destroy temp directory /tmp/tmpPQ895t
[ceph_deploy][ERROR ] RuntimeError: Failed to connect any mon
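If gatherkeys keeps failing, a possible manual fallback (just a sketch
based on the documented manual mgr deployment, not something I have
tried here; paths assume the default cluster name "ceph") would be to
create the mgr key directly on the monitor node and start the daemon
by hand:

# ceph auth get-or-create mgr.controller mon 'allow profile mgr' osd 'allow *' mds 'allow *'
# mkdir -p /var/lib/ceph/mgr/ceph-controller
# ceph auth get mgr.controller -o /var/lib/ceph/mgr/ceph-controller/keyring
# systemctl start ceph-mgr@controller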
On Wed, Feb 28, 2018 at 5:21 PM, Georgios Dimitrakakis
<giorgis@xxxxxxxxxxxx> wrote:
All,
I have updated my test ceph cluster from Jewel (10.2.10) to Luminous
(12.2.4) using CentOS packages.
I have updated all packages and restarted all services in the proper
order, but I get a warning that no Manager daemon exists.
Here is the output:
# ceph -s
  cluster:
    id:     d357a551-5b7a-4501-8d8f-009c63b2c972
    health: HEALTH_WARN
            no active mgr

  services:
    mon: 1 daemons, quorum controller
    mgr: no daemons active
    osd: 2 osds: 2 up, 2 in

  data:
    pools:   0 pools, 0 pgs
    objects: 0 objects, 0 bytes
    usage:   0 kB used, 0 kB / 0 kB avail
    pgs:
While at the same time the systemd target is up and running:

# systemctl status ceph-mgr.target
● ceph-mgr.target - ceph target allowing to start/stop all ceph-mgr@.service instances at once
   Loaded: loaded (/usr/lib/systemd/system/ceph-mgr.target; enabled; vendor preset: enabled)
   Active: active since Wed 2018-02-28 18:57:13 EET; 12min ago
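Note that the target unit only groups the instances; to see whether an
actual mgr instance is running, one would check the instance units
themselves, e.g.:

# systemctl list-units 'ceph-mgr@*'
# systemctl status ceph-mgr@controller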
I understand that I have to add a new MGR, but when I try to do it via
"ceph-deploy" it fails with the following error:
# ceph-deploy mgr create controller
usage: ceph-deploy [-h] [-v | -q] [--version] [--username USERNAME]
                   [--overwrite-conf] [--cluster NAME] [--ceph-conf CEPH_CONF]
                   COMMAND ...
ceph-deploy: error: argument COMMAND: invalid choice: 'mgr' (choose from
'new', 'install', 'rgw', 'mon', 'mds', 'gatherkeys', 'disk', 'osd',
'admin', 'repo', 'config', 'uninstall', 'purge', 'purgedata',
'calamari', 'forgetkeys', 'pkg')
You probably have an older version of ceph-deploy, from before it knew
how to create mgr daemons.
John
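Since it was installed via pip (per the reply further up), upgrading it
should presumably be as simple as:

$ sudo pip install --upgrade ceph-deploy
$ ceph-deploy --version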
where "controller" is the node where ceph monitor is already
running.
Any ideas why I cannot do it via "ceph-deploy" and what do I have
to do to
have it back in a healthy state?
I am running CentOS 7.4.1708 (Core).
Best,
G.
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com