Monitor cannot rejoin the cluster

Hi all,

My monitor mon3 is not able to rejoin the cluster (mon1, mon2 and mon3, running stable Emperor).
I tried recreating and injecting a new monmap into all three mons, but only mon1 and mon2 are up and have joined.

After enabling debug logging on mon3, I get the following:

2014-01-30 08:51:03.823669 7f39b3f56700 10 mon.ceph-mon3@2(probing) e3 handle_probe_reply mon.1 192.168.135.32:6789/0mon_probe(reply c7b12656-15a6-41b0-963f-4f47c62497dc name ceph-mon2 quorum 0,1 paxos( fc 1 lc 160 )) v5
2014-01-30 08:51:03.823678 7f39b3f56700 10 mon.ceph-mon3@2(probing) e3  monmap is e3: 3 mons at {mon.ceph-mon1=192.168.135.31:6789/0,mon.ceph-mon2=192.168.135.32:6789/0,mon.ceph-mon3=192.168.135.33:6789/0}
2014-01-30 08:51:03.823701 7f39b3f56700 10 mon.ceph-mon3@2(probing) e3  peer name is mon.ceph-mon2
2014-01-30 08:51:03.823706 7f39b3f56700 10 mon.ceph-mon3@2(probing) e3  existing quorum 0,1
2014-01-30 08:51:03.823708 7f39b3f56700 10 mon.ceph-mon3@2(probing) e3  peer paxos version 160 vs my version 154 (ok)
2014-01-30 08:51:03.823711 7f39b3f56700 10 mon.ceph-mon3@2(probing) e3  ready to join, but i'm not in the monmap or my addr is blank, trying to join
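
To make the log prefix easier to read, here is a minimal Python sketch that splits one of the lines above into its fields. The field layout (timestamp, thread id, debug level, mon.NAME@RANK(STATE), eEPOCH, message) is only inferred from the lines in this post, and the helper name parse_mon_line is made up for illustration.

```python
import re

# One debug line copied from the log above.
LINE = ("2014-01-30 08:51:03.823708 7f39b3f56700 10 mon.ceph-mon3@2(probing) e3 "
        " peer paxos version 160 vs my version 154 (ok)")

# Assumed layout, inferred from the lines in this post:
#   date time thread level mon.NAME@RANK(STATE) eEPOCH message
PREFIX = re.compile(
    r"^(?P<ts>\S+ \S+) (?P<thread>\S+)\s+(?P<level>\d+) "
    r"mon\.(?P<name>[^@\s]+)@(?P<rank>-?\d+)\((?P<state>[^)]*)\) "
    r"e(?P<epoch>\d+)\s+(?P<msg>.*)$"
)

def parse_mon_line(line):
    """Return the prefix fields as a dict of strings, or None if the line does not match."""
    m = PREFIX.match(line)
    return m.groupdict() if m else None

fields = parse_mon_line(LINE)
print(fields["name"], fields["rank"], fields["state"], fields["epoch"])
print(fields["msg"])
```

Read this way, the daemon logs under the name ceph-mon3 with rank 2, in state probing, at monmap epoch 3.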

But why does mon3 report "i'm not in the monmap"?

Checking the source (https://github.com/ceph/ceph/blob/emperor/src/mon/Monitor.cc):
-->         if (monmap->contains(name) &&
-->             !monmap->get_addr(name).is_blank_ip()) {
              // i'm part of the cluster; just initiate a new election
              start_election();
            } else {
              dout(10) << " ready to join, but i'm not in the monmap or my addr is blank, trying to join" << dendl;
              messenger->send_message(new MMonJoin(monmap->fsid, name, messenger->get_myaddr()),
                            monmap->get_inst(*m->quorum.begin()));
            }

My map on mon3 looks like

root@ceph-mon3:/var/log/ceph# ceph --cluster=ceph --admin-daemon /var/run/ceph/ceph-mon.ceph-mon3.asok mon_status
{ "name": "ceph-mon3",
  "rank": 2,
  "state": "probing",
  "election_epoch": 0,
  "quorum": [],
  "outside_quorum": [],
  "extra_probe_peers": [],
  "sync_provider": [],
  "monmap": { "epoch": 3,
      "fsid": "c7b12656-15a6-41b0-963f-4f47c62497dc",
      "modified": "2014-01-30 08:27:28.808771",
      "created": "2014-01-30 08:27:28.808771",
      "mons": [
            { "rank": 0,
              "name": "mon.ceph-mon1",
              "addr": "192.168.135.31:6789\/0"},
            { "rank": 1,
              "name": "mon.ceph-mon2",
              "addr": "192.168.135.32:6789\/0"},
            { "rank": 2,
              "name": "mon.ceph-mon3",
              "addr": "192.168.135.33:6789\/0"}]}}


So the condition "(monmap->contains(name) && !monmap->get_addr(name).is_blank_ip())" should evaluate to true, shouldn't it? Yet start_election() is not called.
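
One way to double-check that condition is to replay it against the mon_status output. The Python sketch below does so under two assumptions that are not confirmed by the source quoted here: that contains() is an exact string match on the monitor name, and that a "blank" address means 0.0.0.0 or an empty host. The helpers is_blank_ip and would_start_election are hypothetical stand-ins, not Ceph functions.

```python
import json

# mon_status output from mon3, copied from above and trimmed to the fields used here.
MON_STATUS = r'''
{ "name": "ceph-mon3",
  "rank": 2,
  "state": "probing",
  "monmap": { "epoch": 3,
      "mons": [
            { "rank": 0, "name": "mon.ceph-mon1", "addr": "192.168.135.31:6789\/0"},
            { "rank": 1, "name": "mon.ceph-mon2", "addr": "192.168.135.32:6789\/0"},
            { "rank": 2, "name": "mon.ceph-mon3", "addr": "192.168.135.33:6789\/0"}]}}
'''

def is_blank_ip(addr):
    """Rough stand-in for is_blank_ip(): treat 0.0.0.0 or an empty host as blank (assumption)."""
    host = addr.split(":", 1)[0]
    return host in ("", "0.0.0.0")

def would_start_election(status):
    """Mimic the quoted condition, assuming contains() is an exact name match."""
    name = status["name"]
    addrs = {m["name"]: m["addr"] for m in status["monmap"]["mons"]}
    return name in addrs and not is_blank_ip(addrs[name])

status = json.loads(MON_STATUS)
print("daemon name: ", status["name"])
print("monmap names:", [m["name"] for m in status["monmap"]["mons"]])
print("would start election:", would_start_election(status))
```

Under these assumptions the check comes out False for the data above: the daemon reports its name as "ceph-mon3", while the entries in its monmap are named "mon.ceph-mon1", "mon.ceph-mon2" and "mon.ceph-mon3". Whether the real monitor code compares names in the same way is not verified here.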

Can somebody help me here?

regards
Danny

More information about mon3:

root@ceph-mon3:/var/log/ceph# hostname -a
	ceph-mon3

root@ceph-mon3:/var/log/ceph# netstat -tulpen | grep ceph-mon
	tcp        0      0 192.168.135.33:6789     0.0.0.0:*               LISTEN      0          635369      2164/ceph-mon   

root@ceph-mon3:/var/log/ceph# cat /etc/hosts
	127.0.0.1       localhost
	192.168.135.33  ceph-mon3.dtnet.de      ceph-mon3

admin@ceph-admin:~/cluster1$ ceph -s
    cluster c7b12656-15a6-41b0-963f-4f47c62497dc
     health HEALTH_WARN 192 pgs degraded; 192 pgs stale; 192 pgs stuck stale; 192 pgs stuck unclean; 1 mons down, quorum 0,1 ceph-mon1,ceph-mon2
     monmap e3: 3 mons at {ceph-mon1=192.168.135.31:6789/0,ceph-mon2=192.168.135.32:6789/0,ceph-mon3=192.168.135.33:6789/0}, election epoch 230, quorum 0,1 ceph-mon1,ceph-mon2
     osdmap e28: 1 osds: 1 up, 1 in
      pgmap v38: 192 pgs, 3 pools, 0 bytes data, 0 objects
            36388 kB used, 3724 GB / 3724 GB avail
                 192 stale+active+degraded


_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
