Re: Cannot Create MGR

Indeed that was the problem!

In case anyone else runs into the same situation, keep in mind that no matter what you pass to the "ceph-deploy" command, at some point it will use the output of "hostname -s" and try to connect to that monitor to gather data. If you have changed the hostname, as I had, the existing files will not match that output. I therefore had to temporarily switch back to the old hostname in order to gather the keys, deploy them, and finally create the MGR. After that the cluster returned to a healthy state.
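A quick way to check for this kind of mismatch is to compare the short hostname against the monitor names registered in the monmap. This is only a sketch (it assumes a working admin keyring and parses "mon.<name>" entries out of "ceph mon dump"):

```shell
#!/bin/sh
# Compare the short hostname with the monitor names in the monmap.
# ceph-deploy builds "mon.$(hostname -s)" internally, so if they differ
# it will fail to find the local monitor's admin socket and keys.
short_host=$(hostname -s)
echo "Short hostname: ${short_host}"

# Monitor names known to the cluster (lines look like "0: addr mon.<name>").
mon_names=$(ceph mon dump 2>/dev/null | awk -F'mon\\.' '/mon\./ {print $2}')
echo "Monitor names in monmap: ${mon_names}"

for name in ${mon_names}; do
    if [ "${name}" = "${short_host}" ]; then
        echo "OK: ${short_host} matches a registered monitor"
        exit 0
    fi
done
echo "WARNING: ${short_host} does not match any monitor name" >&2
```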

Thank you all for your time.

Regards,

G.

Could it be a problem that I have changed the hostname after the mon
creation?

What I mean is that

# hostname -s
ovhctrl


# ceph daemon mon.$(hostname -s) quorum_status
admin_socket: exception getting command descriptions: [Errno 2] No
such file or directory


But if I do it as "nefelus-controller", which is how it was created initially:

# ceph daemon mon.nefelus-controller quorum_status
{
    "election_epoch": 69,
    "quorum": [
        0
    ],
    "quorum_names": [
        "nefelus-controller"
    ],
    "quorum_leader_name": "nefelus-controller",
    "monmap": {
        "epoch": 2,
        "fsid": "d357a551-5b7a-4501-8d8f-009c63b2c972",
        "modified": "2018-02-28 18:49:55.985382",
        "created": "2017-03-23 22:36:56.897038",
        "features": {
            "persistent": [
                "kraken",
                "luminous"
            ],
            "optional": []
        },
        "mons": [
            {
                "rank": 0,
                "name": "nefelus-controller",
                "addr": "xxx.xxx.xxx.xxx:6789/0",
                "public_addr": "xxx.xxx.xxx.xxx:6789/0"
            }
        ]
    }
}



Additionally, "ceph auth list" has the [mgr] caps in every entry.

G.



Hi,

it looks like you haven't run the ceph-deploy command as the same user,
and possibly not from the same current working directory. This could
explain your problem.

Make sure the other daemons have an mgr cap authorisation. You can
find details on this ML about MGR caps being incorrect for OSDs and
MONs after a Jewel to Luminous upgrade. The output of a ceph auth list
command should help you find out if that's the case.
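For reference, the fix usually suggested for OSD keys created before Luminous is to grant each one the mgr profile. A minimal sketch, assuming the default "profile osd" caps and a working client.admin key:

```shell
#!/bin/sh
# Grant the mgr profile to every OSD key (Luminous OSDs need to be
# able to talk to the mgr). "ceph osd ls" prints one OSD id per line.
for id in $(ceph osd ls); do
    ceph auth caps "osd.${id}" \
        mon 'allow profile osd' \
        mgr 'allow profile osd' \
        osd 'allow *'
done
```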

Are your ceph daemons still running? What does a ceph daemon
mon.$(hostname -s) quorum_status give you from a MON server?

JC

On Feb 28, 2018, at 10:05, Georgios Dimitrakakis <giorgis@xxxxxxxxxxxx> wrote:


Indeed John,

you are right! I have updated "ceph-deploy" (which was installed via "pip", so it wasn't updated with the rest of the ceph packages) but now it complains that keys are missing

$ ceph-deploy mgr create controller
[ceph_deploy.conf][DEBUG ] found configuration file at: /home/user/.cephdeploy.conf
[ceph_deploy.cli][INFO  ] Invoked (2.0.0): /usr/bin/ceph-deploy mgr create controller
[ceph_deploy.cli][INFO  ] ceph-deploy options:
[ceph_deploy.cli][INFO  ]  username                      : None
[ceph_deploy.cli][INFO  ]  verbose                       : False
[ceph_deploy.cli][INFO  ]  mgr                           : [('controller', 'controller')]
[ceph_deploy.cli][INFO  ]  overwrite_conf                : False
[ceph_deploy.cli][INFO  ]  subcommand                    : create
[ceph_deploy.cli][INFO  ]  quiet                         : False
[ceph_deploy.cli][INFO  ]  cd_conf                       : <ceph_deploy.conf.cephdeploy.Conf instance at 0x1d42bd8>
[ceph_deploy.cli][INFO  ]  cluster                       : ceph
[ceph_deploy.cli][INFO  ]  func                          : <function mgr at 0x1cce500>
[ceph_deploy.cli][INFO  ]  ceph_conf                     : None
[ceph_deploy.cli][INFO  ]  default_release               : False
[ceph_deploy.mgr][DEBUG ] Deploying mgr, cluster ceph hosts controller:controller
[ceph_deploy][ERROR ] RuntimeError: bootstrap-mgr keyring not found; run 'gatherkeys'


and I cannot get the keys...



$ ceph-deploy gatherkeys controller
[ceph_deploy.conf][DEBUG ] found configuration file at: /home/user/.cephdeploy.conf
[ceph_deploy.cli][INFO  ] Invoked (2.0.0): /usr/bin/ceph-deploy gatherkeys controller
[ceph_deploy.cli][INFO  ] ceph-deploy options:
[ceph_deploy.cli][INFO  ]  username                      : None
[ceph_deploy.cli][INFO  ]  verbose                       : False
[ceph_deploy.cli][INFO  ]  overwrite_conf                : False
[ceph_deploy.cli][INFO  ]  quiet                         : False
[ceph_deploy.cli][INFO  ]  cd_conf                       : <ceph_deploy.conf.cephdeploy.Conf instance at 0x199f290>
[ceph_deploy.cli][INFO  ]  cluster                       : ceph
[ceph_deploy.cli][INFO  ]  mon                           : ['controller']
[ceph_deploy.cli][INFO  ]  func                          : <function gatherkeys at 0x198b2a8>
[ceph_deploy.cli][INFO  ]  ceph_conf                     : None
[ceph_deploy.cli][INFO  ]  default_release               : False
[ceph_deploy.gatherkeys][INFO ] Storing keys in temp directory /tmp/tmpPQ895t
[controller][DEBUG ] connection detected need for sudo
[controller][DEBUG ] connected to host: controller
[controller][DEBUG ] detect platform information from remote host
[controller][DEBUG ] detect machine type
[controller][DEBUG ] get remote short hostname
[controller][DEBUG ] fetch remote file
[ceph_deploy.gatherkeys][WARNIN] No mon key found in host: controller
[ceph_deploy.gatherkeys][ERROR ] Failed to connect to host:controller
[ceph_deploy.gatherkeys][INFO  ] Destroy temp directory /tmp/tmpPQ895t
[ceph_deploy][ERROR ] RuntimeError: Failed to connect any mon
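If gatherkeys keeps failing, the mgr can also be deployed by hand without ceph-deploy, following the manual steps from the Ceph mgr documentation. A sketch, assuming the daemon should be named "controller" and the default cluster name "ceph":

```shell
#!/bin/sh
# Manually deploy a ceph-mgr daemon named "controller".
name=controller
mgr_dir=/var/lib/ceph/mgr/ceph-${name}

# Create the mgr key and store it where ceph-mgr expects to find it.
sudo mkdir -p "${mgr_dir}"
ceph auth get-or-create "mgr.${name}" \
    mon 'allow profile mgr' osd 'allow *' mds 'allow *' \
    | sudo tee "${mgr_dir}/keyring" >/dev/null
sudo chown -R ceph:ceph "${mgr_dir}"

# Start the daemon and enable it at boot.
sudo systemctl enable --now "ceph-mgr@${name}"
```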




On Wed, Feb 28, 2018 at 5:21 PM, Georgios Dimitrakakis
<giorgis@xxxxxxxxxxxx> wrote:
All,

I have updated my test ceph cluster from Jewel (10.2.10) to Luminous
(12.2.4) using CentOS packages.

I have updated all packages and restarted all services in the proper order,
but I get a warning that the Manager Daemon doesn't exist.

Here is the output:

# ceph -s
 cluster:
   id:     d357a551-5b7a-4501-8d8f-009c63b2c972
   health: HEALTH_WARN
           no active mgr

 services:
   mon: 1 daemons, quorum controller
   mgr: no daemons active
   osd: 2 osds: 2 up, 2 in

 data:
   pools:   0 pools, 0 pgs
   objects: 0 objects, 0 bytes
   usage:   0 kB used, 0 kB / 0 kB avail
   pgs:


While at the same time the system service is up and running

# systemctl status ceph-mgr.target
● ceph-mgr.target - ceph target allowing to start/stop all ceph-mgr@.service instances at once
   Loaded: loaded (/usr/lib/systemd/system/ceph-mgr.target; enabled; vendor preset: enabled)
   Active: active since Wed 2018-02-28 18:57:13 EET; 12min ago


I understand that I have to add a new MGR but when I try to do it via
"ceph-deploy" it fails with the following error:


# ceph-deploy mgr create controller
usage: ceph-deploy [-h] [-v | -q] [--version] [--username USERNAME]
                   [--overwrite-conf] [--cluster NAME] [--ceph-conf CEPH_CONF]
                   COMMAND ...
ceph-deploy: error: argument COMMAND: invalid choice: 'mgr' (choose from 'new', 'install', 'rgw', 'mon', 'mds', 'gatherkeys', 'disk', 'osd', 'admin',
'repo', 'config', 'uninstall', 'purge', 'purgedata', 'calamari',
'forgetkeys', 'pkg')

You probably have an older version of ceph-deploy, from before it knew
how to create mgr daemons.

John
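Since ceph-deploy was installed via pip here, the distribution package manager will not upgrade it; it has to be upgraded through pip itself. A sketch (the exact pip command may need sudo or a virtualenv depending on how it was installed):

```shell
# A 1.5.x ceph-deploy predates the "mgr" subcommand; 2.x supports it.
ceph-deploy --version

# Upgrade the pip-installed copy, then confirm the new version.
sudo pip install --upgrade ceph-deploy
ceph-deploy --version
```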



where "controller" is the node where ceph monitor is already running.


Any ideas why I cannot do it via "ceph-deploy", and what do I have to do to
get the cluster back to a healthy state?


I am running CentOS 7.4.1708 (Core).

Best,

G.

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com




