Re: Adding a monitor to

Patrick Darley <patrick.darley@xxxxxxxxxxxxxxx> · Thu, 30 Oct 2014 18:12:50 +0000

On 2014-10-30 08:23, Joao Eduardo Luis wrote:
On 10/27/2014 06:37 PM, Patrick Darley wrote:
Hi there

Over the last week or so, I've been trying to connect a ceph monitor
node running on a baserock system to a simple 3-node ubuntu ceph
cluster.

The 3-node ubuntu cluster was created by following the documented
Quick installation guide using 3 VMs running ubuntu Trusty.

Once the ubuntu cluster had been deployed, I followed the directions
below, which I derived by comparing the ceph-deploy debug output, the
ceph documentation on adding monitor nodes to an existing system, and
the ceph documentation on bootstrapping monitor nodes.

  1. scp the /etc/ceph/* from admin node
  2. create the dir: mkdir /var/lib/ceph/mon/ceph-bcc08
  3. generate mon keyring:
     sudo ceph auth get mon. -o /var/lib/ceph/tmp/ceph-bcc08.mon.keyring
  4. generate monmap: sudo ceph mon getmap -o /var/lib/ceph/tmp/monmap
  5. create the monitor's filesystem (mkfs):
     sudo ceph-mon --cluster ceph --mkfs -i bcc08 --keyring /var/lib/ceph/tmp/ceph-bcc08.mon.keyring --monmap /var/lib/ceph/tmp/monmap
  6. Unlink keys and old monmap: rm /var/lib/ceph/tmp/*
  7. touch things:
     touch /var/lib/ceph/mon/ceph-bcc08/done
     touch /var/lib/ceph/mon/ceph-bcc08/sysvinit
  8. Then start the mon: sudo /etc/init.d/ceph start mon.bcc08
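
For reference, steps 2 to 8 above can be sketched as one script that takes the monitor ID as a parameter, so the same ID is used in every path. This is only a sketch under some assumptions: step 1 has already been done, the script runs as root (so no sudo), and the names MON_ID, CLUSTER, DRY_RUN, run and add_monitor are made up for illustration. It also uses mkdir -p and removes only the two temporary files rather than everything under /var/lib/ceph/tmp. DRY_RUN defaults to 1, so the commands are printed rather than executed.

```shell
#!/bin/sh
# Sketch of steps 2-8 above for a single monitor ID. Assumes step 1
# (copying /etc/ceph/* from the admin node) is already done and that the
# script runs as root. DRY_RUN defaults to 1, so commands are only printed.
MON_ID="${MON_ID:-bcc08}"      # monitor ID as used in the steps above
CLUSTER="${CLUSTER:-ceph}"
DRY_RUN="${DRY_RUN:-1}"

MON_DIR="/var/lib/ceph/mon/${CLUSTER}-${MON_ID}"
TMP_DIR="/var/lib/ceph/tmp"
KEYRING="${TMP_DIR}/${CLUSTER}-${MON_ID}.mon.keyring"
MONMAP="${TMP_DIR}/monmap"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Run the steps in order and stop at the first one that fails.
add_monitor() {
    run mkdir -p "$MON_DIR" "$TMP_DIR" &&
    run ceph auth get mon. -o "$KEYRING" &&
    run ceph mon getmap -o "$MONMAP" &&
    run ceph-mon --cluster "$CLUSTER" --mkfs -i "$MON_ID" \
        --keyring "$KEYRING" --monmap "$MONMAP" &&
    run rm -f "$KEYRING" "$MONMAP" &&
    run touch "$MON_DIR/done" "$MON_DIR/sysvinit" &&
    run /etc/init.d/ceph start "mon.$MON_ID"
}

add_monitor
```

Setting MON_ID=bcc07 DRY_RUN=0 in the environment would run the same steps for real against a different monitor ID.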

I have a feeling that the issue is your ceph.conf, which you copy on
the first step.

Have you added the monitor you're adding to it?

I hadn't. I had thought that the monitors communicated via the monmap
and did not consult ceph.conf much, so it hadn't occurred to me to
change it.

Most commonly the initial configuration will be based on either
'mon_initial_members' and 'mon_hosts' config options, or
monitor-specific sections.  Say you initially have something like
this:

mon_initial_members = ucc01
mon_hosts = 192.168.122.95

[mon.ucc01]
host = ucc01

then you'll want to add 'bcc07' to the list before you fire up the
new monitor (possibly you'll even want to do it prior to mkfs):

mon_initial_members = ucc01,bcc07
mon_hosts = 192.168.122.95,whatever-ip-for-bcc07:port-if-not-default

[mon.ucc01]
host = ucc01

[mon.bcc07]
host = bcc07

Let me know if you still have problems after doing this.
  -Joao
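
A quick way to confirm that a given monitor name really is listed in the ceph.conf that was copied around is a small check like the one below. It is a rough sketch: conf_lists_mon is a made-up helper name, and it only looks at a mon_initial_members line (written with underscores or spaces), not at [mon.X] sections.

```shell
# Hypothetical helper: succeeds if monitor name $2 appears in the
# mon_initial_members line of the config file $1.
conf_lists_mon() {
    conf_file=$1
    mon_name=$2
    # Take the value after '=', split it on commas and spaces,
    # and look for an exact match of the monitor name.
    grep -E '^[[:space:]]*mon[_ ]initial[_ ]members[[:space:]]*=' "$conf_file" \
        | cut -d= -f2 \
        | tr ', ' '\n\n' \
        | grep -qx "$mon_name"
}
```

For example, conf_lists_mon /etc/ceph/ceph.conf bcc07 && echo listed would print "listed" only if bcc07 is in that line.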

I adjusted the ceph.conf as described and pushed it to all the nodes in
the cluster at step 1. I then continued with the procedure above, and
when I started bcc07 it entered the probing state.

Checking the log of ucc01 I get the following line repeated:

      2014-10-30 18:02:42.617404 7ff6d4b2a700  1 mon.ucc01@0(leader) e1 peer 192.168.122.42:6789/0 missing features 824633720832

And the log of bcc07 has this line repeated:

      2014-10-30 18:07:44.782907 7fb0fe22a700  1 mon.bcc07@0(probing).paxos(paxos recovering c 0..0) is_readable now=2014-10-30 18:07:44.782912 lease_expire=0.000000 has v0 lc 0
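
The "missing features" number in the ucc01 log line is a bitmask, so it can help to see which bit positions are set before looking them up in the feature definitions of the ceph release in use. The sketch below only does the arithmetic; feature_bits is a made-up helper name, and the mapping from bit position to a named feature is not given in this thread.

```shell
# Hypothetical helper: print the position of every bit set in a
# decimal feature mask, lowest bit first, one per line.
feature_bits() {
    value=$1
    bit=0
    while [ "$value" -gt 0 ]; do
        if [ $((value & 1)) -eq 1 ]; then
            echo "$bit"
        fi
        value=$((value >> 1))
        bit=$((bit + 1))
    done
}

feature_bits 824633720832
```

For 824633720832 this prints 38 and 39, since the value is 2^38 + 2^39. A peer that lacks feature bits the leader expects would usually suggest the two daemons were built from different ceph versions, though that is a guess from the log line alone.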

This is the most communication I've seen between them, so I guess this
is on the right track!

Do you know what the problem might be here?

Thanks very much for your time!

Patrick

When I carry out these steps to add a baserock system to the ubuntu
cluster, the monitor node is not added to the cluster, and the admin
socket mon_status gives the following output.

   ~ # ceph --cluster=ceph --admin-daemon /var/run/ceph/ceph-mon.bcc07.asok mon_status
   { "name": "bcc07",
     "rank": -1,
     "state": "probing",
     "election_epoch": 0,
     "quorum": [],
     "outside_quorum": [],
     "extra_probe_peers": [],
     "sync_provider": [],
     "monmap": { "epoch": 0,
         "fsid": "4460079d-42f4-4e3a-8ce3-e2a7fa2685e6",
         "modified": "2014-10-27 12:37:25.531542",
         "created": "2014-10-27 12:37:25.531542",
         "mons": [
               { "rank": 0,
                 "name": "ucc01",
                 "addr": "192.168.122.95:6789\/0"}]}}

The newly added monitor remains stuck in the probing state
indefinitely. To try to resolve this, I have worked through the
problems listed on the monitor troubleshooting page of the ceph
documentation, e.g. ntp synchronisation and checking network
connectivity (to the best of my ability :-s ).

It is also worth mentioning that I have created a 3 node ceph cluster
on baserock machines (1 mon, 2 osds) then successfully added monitor
nodes running baserock and ubuntu systems using the same 8 step
process given above.

This leaves me confused as to why adding a monitor running on baserock
to the all-ubuntu cluster specifically is causing problems.

Are there any reasons why this 'probing' problem could be occurring?
I'm feeling a little stuck on how to proceed and would welcome any
suggestions.

Thanks for your help,

Patrick
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
