Re: Cannot Create MGR

I am still trying to figure out what the problem is here...

Initially, the cluster was updated OK...

# ceph health detail
HEALTH_WARN noout flag(s) set; all OSDs are running luminous or later but require_osd_release < luminous; no active mgr
noout flag(s) set
all OSDs are running luminous or later but require_osd_release < luminous


While I removed the "noout" flag and ran "ceph osd require-osd-release luminous", I had another window open in which I was following the cluster status.
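For reference, what I ran in the first window was along these lines (the exact command for clearing the flag is from memory):

# ceph osd unset noout
# ceph osd require-osd-release luminous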



# ceph -w
  cluster:
    id:     d357a551-5b7a-4501-8d8f-009c63b2c972
    health: HEALTH_WARN
all OSDs are running luminous or later but require_osd_release < luminous
            no active mgr

  services:
    mon: 1 daemons, quorum nefelus-controller
    mgr: no daemons active
    osd: 2 osds: 2 up, 2 in

  data:
    pools:   11 pools, 152 pgs
    objects: 9754 objects, 33754 MB
    usage:   67495 MB used, 3648 GB / 3714 GB avail
    pgs:     152 active+clean


2018-02-28 19:03:20.105027 mon.nefelus-controller [INF] pgmap 152 pgs: 152 active+clean; 33754 MB data, 67495 MB used, 3648 GB / 3714 GB avail
2018-02-28 19:03:24.101868 mon.nefelus-controller [INF] pgmap 152 pgs: 152 active+clean; 33754 MB data, 67495 MB used, 3648 GB / 3714 GB avail
2018-02-28 19:03:25.103605 mon.nefelus-controller [INF] pgmap 152 pgs: 152 active+clean; 33754 MB data, 67495 MB used, 3648 GB / 3714 GB avail
2018-02-28 19:03:29.815572 mon.nefelus-controller [INF] pgmap 152 pgs: 152 active+clean; 33754 MB data, 67495 MB used, 3648 GB / 3714 GB avail; 2671 B/s rd, 89 op/s
2018-02-28 19:03:34.105263 mon.nefelus-controller [INF] pgmap 152 pgs: 152 active+clean; 33754 MB data, 67495 MB used, 3648 GB / 3714 GB avail; 4472 B/s rd, 240 op/s
2018-02-28 19:03:35.108174 mon.nefelus-controller [INF] pgmap 152 pgs: 152 active+clean; 33754 MB data, 67495 MB used, 3648 GB / 3714 GB avail; 9020 B/s rd, 538 op/s
2018-02-28 19:03:39.104781 mon.nefelus-controller [INF] pgmap 152 pgs: 152 active+clean; 33754 MB data, 67495 MB used, 3648 GB / 3714 GB avail; 7598 B/s rd, 453 op/s
2018-02-28 19:03:40.108741 mon.nefelus-controller [INF] pgmap 152 pgs: 152 active+clean; 33754 MB data, 67495 MB used, 3648 GB / 3714 GB avail; 9020 B/s rd, 538 op/s
2018-02-28 19:03:44.105574 mon.nefelus-controller [INF] pgmap 152 pgs: 152 active+clean; 33754 MB data, 67495 MB used, 3648 GB / 3714 GB avail; 7598 B/s rd, 453 op/s
2018-02-28 19:03:45.107522 mon.nefelus-controller [INF] pgmap 152 pgs: 152 active+clean; 33754 MB data, 67495 MB used, 3648 GB / 3714 GB avail; 6696 B/s rd, 471 op/s
2018-02-28 19:03:49.106530 mon.nefelus-controller [INF] pgmap 152 pgs: 152 active+clean; 33754 MB data, 67495 MB used, 3648 GB / 3714 GB avail; 3958 B/s rd, 269 op/s
2018-02-28 19:03:50.110731 mon.nefelus-controller [INF] pgmap 152 pgs: 152 active+clean; 33754 MB data, 67495 MB used, 3648 GB / 3714 GB avail
2018-02-28 19:03:54.107816 mon.nefelus-controller [INF] pgmap 152 pgs: 152 active+clean; 33754 MB data, 67495 MB used, 3648 GB / 3714 GB avail
2018-02-28 19:03:55.109359 mon.nefelus-controller [INF] pgmap 152 pgs: 152 active+clean; 33754 MB data, 67495 MB used, 3648 GB / 3714 GB avail
2018-02-28 19:03:59.108575 mon.nefelus-controller [INF] pgmap 152 pgs: 152 active+clean; 33754 MB data, 67495 MB used, 3648 GB / 3714 GB avail
2018-02-28 19:04:00.110692 mon.nefelus-controller [INF] pgmap 152 pgs: 152 active+clean; 33754 MB data, 67495 MB used, 3648 GB / 3714 GB avail
2018-02-28 19:04:04.109099 mon.nefelus-controller [INF] pgmap 152 pgs: 152 active+clean; 33754 MB data, 67495 MB used, 3648 GB / 3714 GB avail
2018-02-28 19:04:05.111035 mon.nefelus-controller [INF] pgmap 152 pgs: 152 active+clean; 33754 MB data, 67495 MB used, 3648 GB / 3714 GB avail
2018-02-28 19:04:09.110238 mon.nefelus-controller [INF] pgmap 152 pgs: 152 active+clean; 33754 MB data, 67495 MB used, 3648 GB / 3714 GB avail
2018-02-28 19:04:10.112094 mon.nefelus-controller [INF] pgmap 152 pgs: 152 active+clean; 33754 MB data, 67495 MB used, 3648 GB / 3714 GB avail
2018-02-28 19:04:14.111468 mon.nefelus-controller [INF] pgmap 152 pgs: 152 active+clean; 33754 MB data, 67495 MB used, 3648 GB / 3714 GB avail
2018-02-28 19:04:15.113370 mon.nefelus-controller [INF] pgmap 152 pgs: 152 active+clean; 33754 MB data, 67495 MB used, 3648 GB / 3714 GB avail
2018-02-28 19:04:19.112223 mon.nefelus-controller [INF] pgmap 152 pgs: 152 active+clean; 33754 MB data, 67495 MB used, 3648 GB / 3714 GB avail
2018-02-28 19:04:20.116135 mon.nefelus-controller [INF] pgmap 152 pgs: 152 active+clean; 33754 MB data, 67495 MB used, 3648 GB / 3714 GB avail
2018-02-28 19:04:24.113174 mon.nefelus-controller [INF] pgmap 152 pgs: 152 active+clean; 33754 MB data, 67495 MB used, 3648 GB / 3714 GB avail
2018-02-28 19:04:25.114808 mon.nefelus-controller [INF] pgmap 152 pgs: 152 active+clean; 33754 MB data, 67495 MB used, 3648 GB / 3714 GB avail
2018-02-28 19:04:28.172510 mon.nefelus-controller [INF] setting require_min_compat_client to currently required firefly
2018-02-28 19:04:33.243221 osd.0 [INF] 4.6 scrub updated num_legacy_snapsets from 14 -> 0
^C#

After some time, since I wasn't getting any further output, I stopped it with CTRL-C. I am very curious about the last lines above mentioning "firefly" and "num_legacy_snapsets", and whether they mean something bad.


After that, any further attempt showed that my pools, PGs, data, etc. were gone...

# ceph -w
  cluster:
    id:     d357a551-5b7a-4501-8d8f-009c63b2c972
    health: HEALTH_WARN
            no active mgr

  services:
    mon: 1 daemons, quorum nefelus-controller
    mgr: no daemons active
    osd: 2 osds: 2 up, 2 in

  data:
    pools:   0 pools, 0 pgs
    objects: 0 objects, 0 bytes
    usage:   0 kB used, 0 kB / 0 kB avail
    pgs:


I really need some help here to put everything back online...
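(To double-check whether the data is really gone, or whether it is only the status output that is empty, I suppose the pools can also be listed directly with something like:

# ceph osd lspools
# ceph df

but I am not sure how to interpret whatever those report while there is no active mgr.)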

Thanks,

G.



OK...now this is getting crazy...


  data:
    pools:   0 pools, 0 pgs
    objects: 0 objects, 0 bytes
    usage:   0 kB used, 0 kB / 0 kB avail
    pgs:



Where has everything gone??

What's happening here?


G.

Indeed John,

you are right! I have updated "ceph-deploy" (which was installed via "pip", which is why it was not updated along with the rest of the Ceph packages), but now it complains that keys are missing:

$ ceph-deploy mgr create controller
[ceph_deploy.conf][DEBUG ] found configuration file at: /home/user/.cephdeploy.conf
[ceph_deploy.cli][INFO  ] Invoked (2.0.0): /usr/bin/ceph-deploy mgr create controller
[ceph_deploy.cli][INFO  ] ceph-deploy options:
[ceph_deploy.cli][INFO  ]  username                      : None
[ceph_deploy.cli][INFO  ]  verbose                       : False
[ceph_deploy.cli][INFO  ]  mgr                           : [('controller', 'controller')]
[ceph_deploy.cli][INFO  ]  overwrite_conf                : False
[ceph_deploy.cli][INFO  ]  subcommand                    : create
[ceph_deploy.cli][INFO  ]  quiet                         : False
[ceph_deploy.cli][INFO  ]  cd_conf                       : <ceph_deploy.conf.cephdeploy.Conf instance at 0x1d42bd8>
[ceph_deploy.cli][INFO  ]  cluster                       : ceph
[ceph_deploy.cli][INFO  ]  func                          : <function mgr at 0x1cce500>
[ceph_deploy.cli][INFO  ]  ceph_conf                     : None
[ceph_deploy.cli][INFO  ]  default_release               : False
[ceph_deploy.mgr][DEBUG ] Deploying mgr, cluster ceph hosts controller:controller
[ceph_deploy][ERROR ] RuntimeError: bootstrap-mgr keyring not found; run 'gatherkeys'


and I cannot get the keys...



$ ceph-deploy gatherkeys controller
[ceph_deploy.conf][DEBUG ] found configuration file at: /home/user/.cephdeploy.conf
[ceph_deploy.cli][INFO  ] Invoked (2.0.0): /usr/bin/ceph-deploy gatherkeys controller
[ceph_deploy.cli][INFO  ] ceph-deploy options:
[ceph_deploy.cli][INFO  ]  username                      : None
[ceph_deploy.cli][INFO  ]  verbose                       : False
[ceph_deploy.cli][INFO  ]  overwrite_conf                : False
[ceph_deploy.cli][INFO  ]  quiet                         : False
[ceph_deploy.cli][INFO  ]  cd_conf                       : <ceph_deploy.conf.cephdeploy.Conf instance at 0x199f290>
[ceph_deploy.cli][INFO  ]  cluster                       : ceph
[ceph_deploy.cli][INFO  ]  mon                           : ['controller']
[ceph_deploy.cli][INFO  ]  func                          : <function gatherkeys at 0x198b2a8>
[ceph_deploy.cli][INFO  ]  ceph_conf                     : None
[ceph_deploy.cli][INFO  ]  default_release               : False
[ceph_deploy.gatherkeys][INFO  ] Storing keys in temp directory /tmp/tmpPQ895t
[controller][DEBUG ] connection detected need for sudo
[controller][DEBUG ] connected to host: controller
[controller][DEBUG ] detect platform information from remote host
[controller][DEBUG ] detect machine type
[controller][DEBUG ] get remote short hostname
[controller][DEBUG ] fetch remote file
[ceph_deploy.gatherkeys][WARNIN] No mon key found in host: controller
[ceph_deploy.gatherkeys][ERROR ] Failed to connect to host:controller
[ceph_deploy.gatherkeys][INFO  ] Destroy temp directory /tmp/tmpPQ895t
[ceph_deploy][ERROR ] RuntimeError: Failed to connect any mon
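(If ceph-deploy keeps failing here, I suppose I could try creating the mgr by hand instead; my understanding of the manual procedure, assuming the default cluster name "ceph" and "controller" as the mgr id, is roughly:

# mkdir -p /var/lib/ceph/mgr/ceph-controller
# ceph auth get-or-create mgr.controller mon 'allow profile mgr' osd 'allow *' mds 'allow *' -o /var/lib/ceph/mgr/ceph-controller/keyring
# chown -R ceph:ceph /var/lib/ceph/mgr/ceph-controller
# systemctl enable --now ceph-mgr@controller

but I would prefer to understand why gatherkeys fails first.)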




On Wed, Feb 28, 2018 at 5:21 PM, Georgios Dimitrakakis
<giorgis@xxxxxxxxxxxx> wrote:
All,

I have updated my test Ceph cluster from Jewel (10.2.10) to Luminous (12.2.4) using CentOS packages.

I have updated all packages and restarted all services in the proper order, but I get a warning that the manager daemon doesn't exist.
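By "proper order" I mean the monitor first and then the OSDs, i.e. on each node roughly:

# systemctl restart ceph-mon.target
# systemctl restart ceph-osd.target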

Here is the output:

# ceph -s
  cluster:
    id:     d357a551-5b7a-4501-8d8f-009c63b2c972
    health: HEALTH_WARN
            no active mgr

  services:
    mon: 1 daemons, quorum controller
    mgr: no daemons active
    osd: 2 osds: 2 up, 2 in

  data:
    pools:   0 pools, 0 pgs
    objects: 0 objects, 0 bytes
    usage:   0 kB used, 0 kB / 0 kB avail
    pgs:


At the same time, the systemd target is up and running:

# systemctl status ceph-mgr.target
● ceph-mgr.target - ceph target allowing to start/stop all ceph-mgr@.service instances at once
   Loaded: loaded (/usr/lib/systemd/system/ceph-mgr.target; enabled; vendor preset: enabled)
   Active: active since Wed 2018-02-28 18:57:13 EET; 12min ago
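(As far as I understand, the target being active does not necessarily mean that an actual ceph-mgr@<id> instance exists; whether one is configured at all should show up with something like:

# systemctl list-units 'ceph-mgr@*'
)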


I understand that I have to add a new MGR, but when I try to do so via "ceph-deploy" it fails with the following error:


# ceph-deploy mgr create controller
usage: ceph-deploy [-h] [-v | -q] [--version] [--username USERNAME] [--overwrite-conf] [--cluster NAME] [--ceph-conf CEPH_CONF]
                   COMMAND ...
ceph-deploy: error: argument COMMAND: invalid choice: 'mgr' (choose from 'new', 'install', 'rgw', 'mon', 'mds', 'gatherkeys', 'disk', 'osd', 'admin', 'repo', 'config', 'uninstall', 'purge', 'purgedata', 'calamari', 'forgetkeys', 'pkg')

You probably have an older version of ceph-deploy, from before it knew
how to create mgr daemons.

John
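For instance, a recent ceph-deploy (2.0.0 is the version that shows up later in this thread) does know the "mgr create" subcommand. If yours was installed with pip, checking and upgrading it is roughly:

$ ceph-deploy --version
$ sudo pip install --upgrade ceph-deploy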



where "controller" is the node where ceph monitor is already running.


Any ideas why I cannot do it via "ceph-deploy", and what I have to do to get the cluster back into a healthy state?


I am running CentOS 7.4.1708 (Core).

Best,

G.

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



