Re: Cannot Create MGR


 



Could it be a problem that I have changed the hostname after the mon creation?

What I mean is that

# hostname -s
ovhctrl


# ceph daemon mon.$(hostname -s) quorum_status
admin_socket: exception getting command descriptions: [Errno 2] No such file or directory


But if I run it as "nefelus-controller", which is the name the mon was originally created with, it works:

# ceph daemon mon.nefelus-controller quorum_status
{
    "election_epoch": 69,
    "quorum": [
        0
    ],
    "quorum_names": [
        "nefelus-controller"
    ],
    "quorum_leader_name": "nefelus-controller",
    "monmap": {
        "epoch": 2,
        "fsid": "d357a551-5b7a-4501-8d8f-009c63b2c972",
        "modified": "2018-02-28 18:49:55.985382",
        "created": "2017-03-23 22:36:56.897038",
        "features": {
            "persistent": [
                "kraken",
                "luminous"
            ],
            "optional": []
        },
        "mons": [
            {
                "rank": 0,
                "name": "nefelus-controller",
                "addr": "xxx.xxx.xxx.xxx:6789/0",
                "public_addr": "xxx.xxx.xxx.xxx:6789/0"
            }
        ]
    }
}
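
A small sketch of how the mon ID can be read from the admin socket file name instead of from "hostname -s", which no longer matches after a rename. It assumes the default socket location /var/run/ceph and the default naming scheme <cluster>-mon.<id>.asok; the function names are made up for illustration.

```shell
#!/bin/sh
# Print the mon ID encoded in an admin socket path.
# Assumes the default naming scheme <cluster>-mon.<id>.asok.
mon_id_from_socket() {
    name=$(basename "$1" .asok)      # e.g. ceph-mon.nefelus-controller
    printf '%s\n' "${name#*-mon.}"   # strip everything up to and including "-mon."
}

# List the IDs of all mons that have an admin socket in the given directory
# (default: /var/run/ceph, which is an assumption about this host).
list_local_mon_ids() {
    dir=${1:-/var/run/ceph}
    for sock in "$dir"/*-mon.*.asok; do
        [ -e "$sock" ] || continue   # the glob matched nothing
        mon_id_from_socket "$sock"
    done
}
```

On a host with a single mon, something like ceph daemon "mon.$(list_local_mon_ids | head -n 1)" quorum_status should then reach the daemon regardless of the current hostname.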



Additionally, "ceph auth list" shows the [mgr] caps in every entry.

G.



Hi,

it looks like you haven't run the ceph-deploy command as the same user,
and maybe not from the same working directory, as before. This could
explain your problem.

Make sure the other daemons have a mgr cap authorisation. You can
find on this ML details about MGR caps being incorrect for OSDs and
MONs after a Jewel to Luminous upgrade. The output of a ceph auth list
command should help you find out if it’s the case.
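
One way to scan the "ceph auth list" output for entries without a mgr cap is sketched below. It assumes the usual layout, where each entity name starts at column 0 and its key and caps lines are indented underneath; the function name is hypothetical. Not every entity needs a mgr cap (bootstrap keys, for example, normally have only mon caps), so treat the result as a list to review, not as a list of errors.

```shell
#!/bin/sh
# Read `ceph auth list` output on stdin and print every entity that has
# no "caps: [mgr]" line. Assumes entity names are unindented and their
# attributes are indented below them.
entities_without_mgr_cap() {
    awk '
        function report() { if (entity != "" && !has_mgr) print entity }
        # Unindented line that is not a "...:" banner starts a new entity.
        /^[^ \t]/ && !/:$/ { report(); entity = $1; has_mgr = 0; next }
        /caps: \[mgr\]/    { has_mgr = 1 }
        END                { report() }
    '
}
```

Typical use would be: ceph auth list 2>/dev/null | entities_without_mgr_cap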

Are your ceph daemons still running? What does ceph daemon
mon.$(hostname -s) quorum_status give you on a MON server?

JC

On Feb 28, 2018, at 10:05, Georgios Dimitrakakis <giorgis@xxxxxxxxxxxx> wrote:


Indeed John,

you are right! I have updated "ceph-deploy" (it was installed via "pip", which is why it wasn't updated with the rest of the ceph packages), but now it complains that keys are missing:

$ ceph-deploy mgr create controller
[ceph_deploy.conf][DEBUG ] found configuration file at: /home/user/.cephdeploy.conf
[ceph_deploy.cli][INFO  ] Invoked (2.0.0): /usr/bin/ceph-deploy mgr create controller
[ceph_deploy.cli][INFO  ] ceph-deploy options:
[ceph_deploy.cli][INFO  ]  username                      : None
[ceph_deploy.cli][INFO  ]  verbose                       : False
[ceph_deploy.cli][INFO  ]  mgr                           : [('controller', 'controller')]
[ceph_deploy.cli][INFO  ]  overwrite_conf                : False
[ceph_deploy.cli][INFO  ]  subcommand                    : create
[ceph_deploy.cli][INFO  ]  quiet                         : False
[ceph_deploy.cli][INFO  ]  cd_conf                       : <ceph_deploy.conf.cephdeploy.Conf instance at 0x1d42bd8>
[ceph_deploy.cli][INFO  ]  cluster                       : ceph
[ceph_deploy.cli][INFO  ]  func                          : <function mgr at 0x1cce500>
[ceph_deploy.cli][INFO  ]  ceph_conf                     : None
[ceph_deploy.cli][INFO  ]  default_release               : False
[ceph_deploy.mgr][DEBUG ] Deploying mgr, cluster ceph hosts controller:controller
[ceph_deploy][ERROR ] RuntimeError: bootstrap-mgr keyring not found; run 'gatherkeys'


and I cannot get the keys...



$ ceph-deploy gatherkeys controller
[ceph_deploy.conf][DEBUG ] found configuration file at: /home/user/.cephdeploy.conf
[ceph_deploy.cli][INFO  ] Invoked (2.0.0): /usr/bin/ceph-deploy gatherkeys controller
[ceph_deploy.cli][INFO  ] ceph-deploy options:
[ceph_deploy.cli][INFO  ]  username                      : None
[ceph_deploy.cli][INFO  ]  verbose                       : False
[ceph_deploy.cli][INFO  ]  overwrite_conf                : False
[ceph_deploy.cli][INFO  ]  quiet                         : False
[ceph_deploy.cli][INFO  ]  cd_conf                       : <ceph_deploy.conf.cephdeploy.Conf instance at 0x199f290>
[ceph_deploy.cli][INFO  ]  cluster                       : ceph
[ceph_deploy.cli][INFO  ]  mon                           : ['controller']
[ceph_deploy.cli][INFO  ]  func                          : <function gatherkeys at 0x198b2a8>
[ceph_deploy.cli][INFO  ]  ceph_conf                     : None
[ceph_deploy.cli][INFO  ]  default_release               : False
[ceph_deploy.gatherkeys][INFO  ] Storing keys in temp directory /tmp/tmpPQ895t
[controller][DEBUG ] connection detected need for sudo
[controller][DEBUG ] connected to host: controller
[controller][DEBUG ] detect platform information from remote host
[controller][DEBUG ] detect machine type
[controller][DEBUG ] get remote short hostname
[controller][DEBUG ] fetch remote file
[ceph_deploy.gatherkeys][WARNIN] No mon key found in host: controller
[ceph_deploy.gatherkeys][ERROR ] Failed to connect to host:controller
[ceph_deploy.gatherkeys][INFO  ] Destroy temp directory /tmp/tmpPQ895t
[ceph_deploy][ERROR ] RuntimeError: Failed to connect any mon
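
If gatherkeys keeps failing, a mgr can also be set up by hand without ceph-deploy. The sketch below only prints the commands so they can be reviewed first. It follows the manual procedure from the ceph-mgr administrator's guide as far as I know it, and assumes the cluster name "ceph", the default data path /var/lib/ceph/mgr, a "ceph" system user, and that the commands are run as root on a node with a working admin keyring. The function name and the mgr ID passed to it are placeholders.

```shell
#!/bin/sh
# Print (do not execute) the commands for creating a mgr daemon by hand.
# $1 is the mgr ID to use; the path assumes the cluster is named "ceph".
mgr_manual_plan() {
    name=$1
    dir=/var/lib/ceph/mgr/ceph-$name
    cat <<EOF
mkdir -p $dir
ceph auth get-or-create mgr.$name mon 'allow profile mgr' osd 'allow *' mds 'allow *' -o $dir/keyring
chown -R ceph:ceph $dir
systemctl start ceph-mgr@$name
systemctl enable ceph-mgr@$name
EOF
}
```

For example, mgr_manual_plan controller prints the five commands for a mgr called "controller"; after running them, "ceph -s" should list an active mgr if everything went well.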




On Wed, Feb 28, 2018 at 5:21 PM, Georgios Dimitrakakis
<giorgis@xxxxxxxxxxxx> wrote:
All,

I have updated my test ceph cluster from Jewel (10.2.10) to Luminous
(12.2.4) using CentOS packages.

I have updated all packages and restarted all services in the proper order,
but I get a warning that the Manager Daemon doesn't exist.

Here is the output:

# ceph -s
 cluster:
   id:     d357a551-5b7a-4501-8d8f-009c63b2c972
   health: HEALTH_WARN
           no active mgr

 services:
   mon: 1 daemons, quorum controller
   mgr: no daemons active
   osd: 2 osds: 2 up, 2 in

 data:
   pools:   0 pools, 0 pgs
   objects: 0 objects, 0 bytes
   usage:   0 kB used, 0 kB / 0 kB avail
   pgs:


While at the same time the system service is up and running

# systemctl status ceph-mgr.target
● ceph-mgr.target - ceph target allowing to start/stop all ceph-mgr@.service instances at once
  Loaded: loaded (/usr/lib/systemd/system/ceph-mgr.target; enabled; vendor preset: enabled)
  Active: active since Wed 2018-02-28 18:57:13 EET; 12min ago
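
Note that ceph-mgr.target only groups the ceph-mgr@<id>.service instances, so the target can be active while no mgr instance is running at all. A rough way to count the running instances from the "systemctl list-units" output is sketched here; the function name is made up, and it assumes the plain, legend-free column layout (UNIT LOAD ACTIVE SUB DESCRIPTION).

```shell
#!/bin/sh
# Read `systemctl list-units --plain --no-legend` output on stdin and
# print how many ceph-mgr@<id>.service units are in the "running" state.
# Assumes column 1 is the unit name and column 4 is the SUB state.
count_running_mgr_units() {
    awk '$1 ~ /^ceph-mgr@.*\.service$/ && $4 == "running" { n++ }
         END { print n + 0 }'
}
```

Typical use: systemctl list-units --type=service --plain --no-legend 'ceph-mgr@*' | count_running_mgr_units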


I understand that I have to add a new MGR but when I try to do it via
"ceph-deploy" it fails with the following error:


# ceph-deploy mgr create controller
usage: ceph-deploy [-h] [-v | -q] [--version] [--username USERNAME]
                   [--overwrite-conf] [--cluster NAME] [--ceph-conf CEPH_CONF]
                   COMMAND ...
ceph-deploy: error: argument COMMAND: invalid choice: 'mgr' (choose from 'new', 'install', 'rgw', 'mon', 'mds', 'gatherkeys', 'disk', 'osd', 'admin', 'repo', 'config', 'uninstall', 'purge', 'purgedata', 'calamari', 'forgetkeys', 'pkg')

You probably have an older version of ceph-deploy, from before it knew
how to create mgr daemons.
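
A quick way to check this before upgrading is to ask the installed ceph-deploy whether it accepts the subcommand at all. The sketch relies on the argument parser exiting non-zero for an unknown subcommand, as in the error above; the function name and the CEPH_DEPLOY override variable are hypothetical and exist only so the check can be pointed at a different binary.

```shell
#!/bin/sh
# Succeed if the installed ceph-deploy knows the given subcommand.
# An unknown subcommand makes the argument parser exit non-zero.
# CEPH_DEPLOY is a hypothetical override for the binary to call.
ceph_deploy_has() {
    "${CEPH_DEPLOY:-ceph-deploy}" "$1" --help >/dev/null 2>&1
}
```

For example: ceph_deploy_has mgr || echo "ceph-deploy is too old for mgr". If it was installed via pip, "pip install --upgrade ceph-deploy" should bring in a newer release.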

John



where "controller" is the node where ceph monitor is already running.


Any ideas why I cannot do it via "ceph-deploy", and what I have to do to
get the cluster back to a healthy state?


I am running CentOS 7.4.1708 (Core).

Best,

G.

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com




