Re: Cannot Create MGR

Indeed that was the problem!

In case anyone else runs into the same situation, keep in mind that no matter what you pass to the "ceph-deploy" command, at some point it will use the output of "hostname -s" and try to connect to that monitor to gather data. If you have changed the hostname, as I had, the existing files will not match that output. I therefore had to temporarily switch back to the old hostname in order to gather the keys, deploy them, and finally create the MGR. After that the cluster returned to a healthy state.
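A quick way to check for this kind of mismatch is to compare the short hostname against the monitor names registered in the monmap. This is only a sketch (it assumes a working admin keyring and parses "mon.<name>" entries out of "ceph mon dump"):

```shell
#!/bin/sh
# Compare the short hostname with the monitor names in the monmap.
# ceph-deploy builds "mon.$(hostname -s)" internally, so if they differ
# it will fail to find the local monitor's admin socket and keys.
short_host=$(hostname -s)
echo "Short hostname: ${short_host}"

# Monitor names known to the cluster (lines look like "0: addr mon.<name>").
mon_names=$(ceph mon dump 2>/dev/null | awk -F'mon\\.' '/mon\./ {print $2}')
echo "Monitor names in monmap: ${mon_names}"

for name in ${mon_names}; do
    if [ "${name}" = "${short_host}" ]; then
        echo "OK: ${short_host} matches a registered monitor"
        exit 0
    fi
done
echo "WARNING: ${short_host} does not match any monitor name" >&2
```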

Thank you all for your time.

Regards,

G.

Could it be a problem that I have changed the hostname after the mon
creation?

What I mean is that

# hostname -s
ovhctrl


# ceph daemon mon.$(hostname -s) quorum_status
admin_socket: exception getting command descriptions: [Errno 2] No
such file or directory


But if I do it as "nefelus-controller", which is how it was created initially:

# ceph daemon mon.nefelus-controller quorum_status
{
    "election_epoch": 69,
    "quorum": [
        0
    ],
    "quorum_names": [
        "nefelus-controller"
    ],
    "quorum_leader_name": "nefelus-controller",
    "monmap": {
        "epoch": 2,
        "fsid": "d357a551-5b7a-4501-8d8f-009c63b2c972",
        "modified": "2018-02-28 18:49:55.985382",
        "created": "2017-03-23 22:36:56.897038",
        "features": {
            "persistent": [
                "kraken",
                "luminous"
            ],
            "optional": []
        },
        "mons": [
            {
                "rank": 0,
                "name": "nefelus-controller",
                "addr": "xxx.xxx.xxx.xxx:6789/0",
                "public_addr": "xxx.xxx.xxx.xxx:6789/0"
            }
        ]
    }
}



Additionally, "ceph auth list" has the [mgr] caps in every entry.

G.



Hi,

it looks like you haven't run the ceph-deploy command as the same user,
and possibly not from the same current working directory. This could
explain your problem.

Make sure the other daemons have an mgr cap authorisation. You can
find details on this ML about MGR caps being incorrect for OSDs and
MONs after a Jewel to Luminous upgrade. The output of a ceph auth list
command should help you find out if that's the case.
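For reference, the fix usually suggested for OSD keys created before Luminous is to grant each one the mgr profile. A minimal sketch, assuming the default "profile osd" caps and a working client.admin key:

```shell
#!/bin/sh
# Grant the mgr profile to every OSD key (Luminous OSDs need to be
# able to talk to the mgr). "ceph osd ls" prints one OSD id per line.
for id in $(ceph osd ls); do
    ceph auth caps "osd.${id}" \
        mon 'allow profile osd' \
        mgr 'allow profile osd' \
        osd 'allow *'
done
```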

Are your ceph daemons still running? What does a ceph daemon
mon.$(hostname -s) quorum_status give you from a MON server?

JC

On Feb 28, 2018, at 10:05, Georgios Dimitrakakis <giorgis@xxxxxxxxxxxx> wrote:


Indeed John,

you are right! I have updated "ceph-deploy" (which was installed via "pip", so it wasn't updated with the rest of the ceph packages) but now it complains that keys are missing

$ ceph-deploy mgr create controller
[ceph_deploy.conf][DEBUG ] found configuration file at: /home/user/.cephdeploy.conf
[ceph_deploy.cli][INFO  ] Invoked (2.0.0): /usr/bin/ceph-deploy mgr create controller
[ceph_deploy.cli][INFO  ] ceph-deploy options:
[ceph_deploy.cli][INFO  ]  username                      : None
[ceph_deploy.cli][INFO  ]  verbose                       : False
[ceph_deploy.cli][INFO  ]  mgr                           : [('controller', 'controller')]
[ceph_deploy.cli][INFO  ]  overwrite_conf                : False
[ceph_deploy.cli][INFO  ]  subcommand                    : create
[ceph_deploy.cli][INFO  ]  quiet                         : False
[ceph_deploy.cli][INFO  ]  cd_conf                       : <ceph_deploy.conf.cephdeploy.Conf instance at 0x1d42bd8>
[ceph_deploy.cli][INFO  ]  cluster                       : ceph
[ceph_deploy.cli][INFO  ]  func                          : <function mgr at 0x1cce500>
[ceph_deploy.cli][INFO  ]  ceph_conf                     : None
[ceph_deploy.cli][INFO  ]  default_release               : False
[ceph_deploy.mgr][DEBUG ] Deploying mgr, cluster ceph hosts controller:controller
[ceph_deploy][ERROR ] RuntimeError: bootstrap-mgr keyring not found; run 'gatherkeys'


and I cannot get the keys...



$ ceph-deploy gatherkeys controller
[ceph_deploy.conf][DEBUG ] found configuration file at: /home/user/.cephdeploy.conf
[ceph_deploy.cli][INFO  ] Invoked (2.0.0): /usr/bin/ceph-deploy gatherkeys controller
[ceph_deploy.cli][INFO  ] ceph-deploy options:
[ceph_deploy.cli][INFO  ]  username                      : None
[ceph_deploy.cli][INFO  ]  verbose                       : False
[ceph_deploy.cli][INFO  ]  overwrite_conf                : False
[ceph_deploy.cli][INFO  ]  quiet                         : False
[ceph_deploy.cli][INFO  ]  cd_conf                       : <ceph_deploy.conf.cephdeploy.Conf instance at 0x199f290>
[ceph_deploy.cli][INFO  ]  cluster                       : ceph
[ceph_deploy.cli][INFO  ]  mon                           : ['controller']
[ceph_deploy.cli][INFO  ]  func                          : <function gatherkeys at 0x198b2a8>
[ceph_deploy.cli][INFO  ]  ceph_conf                     : None
[ceph_deploy.cli][INFO  ]  default_release               : False
[ceph_deploy.gatherkeys][INFO ] Storing keys in temp directory /tmp/tmpPQ895t
[controller][DEBUG ] connection detected need for sudo
[controller][DEBUG ] connected to host: controller
[controller][DEBUG ] detect platform information from remote host
[controller][DEBUG ] detect machine type
[controller][DEBUG ] get remote short hostname
[controller][DEBUG ] fetch remote file
[ceph_deploy.gatherkeys][WARNIN] No mon key found in host: controller
[ceph_deploy.gatherkeys][ERROR ] Failed to connect to host:controller
[ceph_deploy.gatherkeys][INFO  ] Destroy temp directory /tmp/tmpPQ895t
[ceph_deploy][ERROR ] RuntimeError: Failed to connect any mon
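If gatherkeys keeps failing, the mgr can also be deployed by hand without ceph-deploy, following the manual steps from the Ceph mgr documentation. A sketch, assuming the daemon should be named "controller" and the default cluster name "ceph":

```shell
#!/bin/sh
# Manually deploy a ceph-mgr daemon named "controller".
name=controller
mgr_dir=/var/lib/ceph/mgr/ceph-${name}

# Create the mgr key and store it where ceph-mgr expects to find it.
sudo mkdir -p "${mgr_dir}"
ceph auth get-or-create "mgr.${name}" \
    mon 'allow profile mgr' osd 'allow *' mds 'allow *' \
    | sudo tee "${mgr_dir}/keyring" >/dev/null
sudo chown -R ceph:ceph "${mgr_dir}"

# Start the daemon and enable it at boot.
sudo systemctl enable --now "ceph-mgr@${name}"
```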




On Wed, Feb 28, 2018 at 5:21 PM, Georgios Dimitrakakis
<giorgis@xxxxxxxxxxxx> wrote:
All,

I have updated my test ceph cluster from Jewel (10.2.10) to Luminous
(12.2.4) using CentOS packages.

I have updated all packages and restarted all services in the proper order,
but I get a warning that the Manager Daemon doesn't exist.

Here is the output:

# ceph -s
 cluster:
   id:     d357a551-5b7a-4501-8d8f-009c63b2c972
   health: HEALTH_WARN
           no active mgr

 services:
   mon: 1 daemons, quorum controller
   mgr: no daemons active
   osd: 2 osds: 2 up, 2 in

 data:
   pools:   0 pools, 0 pgs
   objects: 0 objects, 0 bytes
   usage:   0 kB used, 0 kB / 0 kB avail
   pgs:


While at the same time the system service is up and running

# systemctl status ceph-mgr.target
● ceph-mgr.target - ceph target allowing to start/stop all ceph-mgr@.service instances at once
   Loaded: loaded (/usr/lib/systemd/system/ceph-mgr.target; enabled; vendor preset: enabled)
   Active: active since Wed 2018-02-28 18:57:13 EET; 12min ago


I understand that I have to add a new MGR but when I try to do it via
"ceph-deploy" it fails with the following error:


# ceph-deploy mgr create controller
usage: ceph-deploy [-h] [-v | -q] [--version] [--username USERNAME]
                   [--overwrite-conf] [--cluster NAME] [--ceph-conf CEPH_CONF]
                   COMMAND ...
ceph-deploy: error: argument COMMAND: invalid choice: 'mgr' (choose from 'new', 'install', 'rgw', 'mon', 'mds', 'gatherkeys', 'disk', 'osd', 'admin',
'repo', 'config', 'uninstall', 'purge', 'purgedata', 'calamari',
'forgetkeys', 'pkg')

You probably have an older version of ceph-deploy, from before it knew
how to create mgr daemons.

John
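Since ceph-deploy was installed via pip here, the distribution package manager will not upgrade it; it has to be upgraded through pip itself. A sketch (the exact pip command may need sudo or a virtualenv depending on how it was installed):

```shell
# A 1.5.x ceph-deploy predates the "mgr" subcommand; 2.x supports it.
ceph-deploy --version

# Upgrade the pip-installed copy, then confirm the new version.
sudo pip install --upgrade ceph-deploy
ceph-deploy --version
```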



where "controller" is the node where ceph monitor is already running.


Any ideas why I cannot do it via "ceph-deploy", and what do I have to do to
get the cluster back to a healthy state?


I am running CentOS 7.4.1708 (Core).

Best,

G.

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com




