Re: Luminous: ceph mgr create error - mon disconnected

On 22/07/17 23:50, Oscar Segarra wrote:

Hi,

I have upgraded from Kraken with a simple "yum upgrade" command. After the upgrade, I'd like to deploy the mgr daemon on one node of my Ceph infrastructure.

But, for some reason, it gets stuck!
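
(As a quick sanity check before deploying the mgr, it may be worth confirming the upgrade really landed Luminous on every node; a plain yum upgrade does not restart the running daemons, so mons/OSDs keep running the old binaries until restarted. A minimal sketch, assuming the stock CentOS RPM package names:)

$ ceph --version
$ rpm -q ceph ceph-mon ceph-mgr ceph-osd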

Let's see the complete set of commands:


[root@vdicnode01 ~]# ceph -s
  cluster:
    id:     656e84b2-9192-40fe-9b81-39bd0c7a3196
    health: HEALTH_WARN
            no active mgr

  services:
    mon: 1 daemons, quorum vdicnode01
    mgr: no daemons active
    osd: 2 osds: 2 up, 2 in

  data:
    pools:   0 pools, 0 pgs
    objects: 0 objects, 0 bytes
    usage:   0 kB used, 0 kB / 0 kB avail
    pgs:

[root@vdicnode01 ~]# su - vdicceph
Last login: Sat Jul 22 12:50:38 CEST 2017 on pts/0
[vdicceph@vdicnode01 ~]$ cd ceph

[vdicceph@vdicnode01 ceph]$ ceph-deploy --username vdicceph -v mgr create vdicnode02.local
[ceph_deploy.conf][DEBUG ] found configuration file at: /home/vdicceph/.cephdeploy.conf
[ceph_deploy.cli][INFO  ] Invoked (1.5.38): /bin/ceph-deploy --username vdicceph -v mgr create vdicnode02.local
[ceph_deploy.cli][INFO  ] ceph-deploy options:
[ceph_deploy.cli][INFO  ]  username                      : vdicceph
[ceph_deploy.cli][INFO  ]  verbose                       : True
[ceph_deploy.cli][INFO  ]  mgr                           : [('vdicnode02.local', 'vdicnode02.local')]
[ceph_deploy.cli][INFO  ]  overwrite_conf                : False
[ceph_deploy.cli][INFO  ]  subcommand                    : create
[ceph_deploy.cli][INFO  ]  quiet                         : False
[ceph_deploy.cli][INFO  ]  cd_conf                       : <ceph_deploy.conf.cephdeploy.Conf instance at 0x164f290>
[ceph_deploy.cli][INFO  ]  cluster                       : ceph
[ceph_deploy.cli][INFO  ]  func                          : <function mgr at 0x15db848>
[ceph_deploy.cli][INFO  ]  ceph_conf                     : None
[ceph_deploy.cli][INFO  ]  default_release               : False
[ceph_deploy.mgr][DEBUG ] Deploying mgr, cluster ceph hosts vdicnode02.local:vdicnode02.local
[vdicnode02.local][DEBUG ] connection detected need for sudo
[vdicnode02.local][DEBUG ] connected to host: vdicceph@vdicnode02.local
[vdicnode02.local][DEBUG ] detect platform information from remote host
[vdicnode02.local][DEBUG ] detect machine type
[ceph_deploy.mgr][INFO  ] Distro info: CentOS Linux 7.3.1611 Core
[ceph_deploy.mgr][DEBUG ] remote host will use systemd
[ceph_deploy.mgr][DEBUG ] deploying mgr bootstrap to vdicnode02.local
[vdicnode02.local][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf
[vdicnode02.local][DEBUG ] create path if it doesn't exist
[vdicnode02.local][INFO  ] Running command: sudo ceph --cluster ceph --name client.bootstrap-mgr --keyring /var/lib/ceph/bootstrap-mgr/ceph.keyring auth get-or-create mgr.vdicnode02.local mon allow profile mgr osd allow * mds allow * -o /var/lib/ceph/mgr/ceph-vdicnode02.local/keyring
[vdicnode02.local][WARNIN] No data was received after 300 seconds, disconnecting...
[vdicnode02.local][INFO  ] Running command: sudo systemctl enable ceph-mgr@vdicnode02.local
[vdicnode02.local][WARNIN] Created symlink from /etc/systemd/system/ceph-mgr.target.wants/ceph-mgr@vdicnode02.local.service to /usr/lib/systemd/system/ceph-mgr@.service.
[vdicnode02.local][INFO  ] Running command: sudo systemctl start ceph-mgr@vdicnode02.local
[vdicnode02.local][INFO  ] Running command: sudo systemctl enable ceph.target
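
(The 300-second disconnect above is ceph-deploy giving up on the auth get-or-create call because the mon never answered. A minimal sketch for reproducing it by hand on vdicnode02, using the same bootstrap keyring path as in the log; the --connect-timeout value is arbitrary, it just makes the client error out instead of hanging:)

$ sudo ceph --connect-timeout 30 --cluster ceph \
      --name client.bootstrap-mgr \
      --keyring /var/lib/ceph/bootstrap-mgr/ceph.keyring \
      auth get-or-create mgr.vdicnode02.local \
      mon 'allow profile mgr' osd 'allow *' mds 'allow *'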

[vdicceph@vdicnode01 ceph]$ sudo ceph -s --verbose --watch-warn --watch-error
parsed_args: Namespace(admin_socket=None, admin_socket_nope=None, cephconf=None, client_id=None, client_name=None, cluster=None, cluster_timeout=None, completion=False, help=False, input_file=None, output_file=None, output_format=None, status=True, verbose=True, version=False, watch=False, watch_channel='cluster', watch_debug=False, watch_error=True, watch_info=False, watch_sec=False, watch_warn=True), childargs: []

< no response for ever >
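
(When the client side hangs like this, the mon can still be poked through its local admin socket, which bypasses the cephx handshake entirely. A minimal sketch, assuming the default socket location and the mon ID vdicnode01 shown in the status output above; if the mon is truly wedged, even this may not answer:)

$ sudo ceph daemon mon.vdicnode01 mon_status
$ sudo ceph --admin-daemon /var/run/ceph/ceph-mon.vdicnode01.asok mon_status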

Has anybody experienced the same issue? How can I make my Ceph cluster work again?

Thanks a lot.




I've encountered this (upgrading from Jewel).

The cause seems to be a busted mgr bootstrap key (see below). Simply restarting your Ceph mons *should* get you back to a functioning cluster (the mon has hung because the key is too short); after that you can fix the key and deploy a mgr. Here's my example for deploying a mgr on my host ceph1 (a sketch of the mon restart follows the commands):

$ sudo ceph auth get client.bootstrap-mgr
exported keyring for client.bootstrap-mgr
[client.bootstrap-mgr]
        key = AAAAAAAAAAAAAAAA
        caps mon = "allow profile bootstrap-mgr"


So destroy and recreate it:


$ sudo ceph auth del client.bootstrap-mgr
updated

$ sudo ceph auth get-or-create client.bootstrap-mgr mon 'allow profile bootstrap-mgr'
[client.bootstrap-mgr]
        key = AQBDenFZW7yKJxAAYlSBQLtDADIzsnfBcdxHpg==

$ ceph-deploy -v gatherkeys ceph1
$ ceph-deploy -v mgr create ceph1
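
(For the mon restart mentioned above, a minimal sketch, assuming the stock systemd units and a mon ID equal to the short hostname, as on the original poster's vdicnode01; ceph-mon.target restarts every mon instance on the host:)

$ sudo systemctl restart ceph-mon@vdicnode01
$ sudo systemctl restart ceph-mon.target

Once the mon responds again, ceph -s should return and the key fix above can go ahead.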


regards

Mark




