Re: 1 particular ceph-mon never joins on 0.67.2

On Mon, 26 Aug 2013, Travis Rhoden wrote:
> Hi Sage,
> 
> Thanks for the response.  I noticed that as well, and suspected
> hostname/DHCP/DNS shenanigans.  What's weird is that all nodes are
> identically configured.  I also have monitors running on n0 and n12, and
> they come up fine, every time.
> 
> Here's the mon_host line from ceph.conf:
> 
> mon_initial_members = n0, n12, n24
> mon_host = 10.0.1.0,10.0.1.12,10.0.1.24
> 
> just to test /etc/hosts and name resolution...
> 
> root@n24:~# getent hosts n24
> 10.0.1.24       n24
> root@n24:~# hostname -s
> n24
> 
> The only loopback device in /etc/hosts is "127.0.0.1       localhost", so
> that should be fine. 
> 
> Upon rebooting this node, I've had the monitor come up okay once, maybe out
> of 12 tries.  So it appears to be some kind of race...  No clue what is
> going on.  If I stop and start the monitor (or restart), it doesn't appear
> to change anything.
> 
> However, on the topic of races, I'm having one other more pressing issue.
> Each OSD host has its hostname assigned via DHCP.  Until that
> assignment is made (during init), the hostname is "localhost", and then it
> switches over to "n<x>", for some node number.  The issue I am seeing is
> that there is a race between this hostname assignment and the Ceph Upstart
> scripts, such that sometimes ceph-osd starts while the hostname is still
> 'localhost'.  This then causes the osd location to change in the crushmap,
> which is going to be a very bad thing.  =)  When rebooting all my nodes at
> once (there are several dozen), about 50% move from being under n<x> to
> localhost.  Restarting all the ceph-osd jobs moves them back (because the
> hostname is defined).
> 
> I'm wondering what kind of delay, or additional "start-on" logic I can add
> to the upstart script to work around this.

Hmm, this is beyond my upstart-fu, unfortunately.  This has come up 
before, actually.  Previously we would wait for any interface to come up 
and then start, but that broke with multi-nic machines, and I ended up 
just making things start in runlevel [2345].

James, do you know what should be done to make the job wait for *all*
network interfaces to be up?  Is that even the right solution here?
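
For reference, a rough sketch of one thing that could be tried (untested; it 
assumes Ubuntu's ifupdown/upstart integration, which emits static-network-up 
once every interface marked 'auto' in /etc/network/interfaces is up, and it 
assumes the job currently starting on runlevel [2345] is ceph-all -- adjust 
the file name to whichever job that actually is):

    # /etc/init/ceph-all.override  -- hypothetical, untested sketch
    # Delay the top-level ceph job until local filesystems are mounted and
    # ifupdown reports all configured interfaces up, so the DHCP-assigned
    # hostname is more likely to be in place before ceph-osd registers
    # itself in the crush map.
    start on (local-filesystems and static-network-up)

Whether static-network-up fires late enough to guarantee the DHCP-assigned 
hostname has actually been applied is a separate question.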

sage


> 
> 
> On Fri, Aug 23, 2013 at 4:47 PM, Sage Weil <sage@xxxxxxxxxxx> wrote:
>       Hi Travis,
> 
>       On Fri, 23 Aug 2013, Travis Rhoden wrote:
>       > Hey folks,
>       >
>       > I've just done a brand new install of 0.67.2 on a cluster of Calxeda nodes.
>       >
>       > I have one particular monitor that never joins the quorum when I restart
>       > the node.  Looks to me like it has something to do with the "create-keys"
>       > task, which never seems to finish:
>       >
>       > root      1240     1  4 13:03 ?        00:00:02 /usr/bin/ceph-mon --cluster=ceph -i n24 -f
>       > root      1244     1  0 13:03 ?        00:00:00 /usr/bin/python /usr/sbin/ceph-create-keys --cluster=ceph -i n24
>       >
>       > I don't see that task on my other monitors.  Additionally, that task is
>       > periodically querying the monitor status:
>       >
>       > root      1240     1  2 13:03 ?        00:00:02 /usr/bin/ceph-mon --cluster=ceph -i n24 -f
>       > root      1244     1  0 13:03 ?        00:00:00 /usr/bin/python /usr/sbin/ceph-create-keys --cluster=ceph -i n24
>       > root      1982  1244 15 13:04 ?        00:00:00 /usr/bin/python /usr/bin/ceph --cluster=ceph --admin-daemon=/var/run/ceph/ceph-mon.n24.asok mon_status
>       >
>       > Checking that status myself, I see:
>       >
>       > # ceph --cluster=ceph --admin-daemon=/var/run/ceph/ceph-mon.n24.asok mon_status
>       > { "name": "n24",
>       >   "rank": 2,
>       >   "state": "probing",
>       >   "election_epoch": 0,
>       >   "quorum": [],
>       >   "outside_quorum": [
>       >         "n24"],
>       >   "extra_probe_peers": [],
>       >   "sync_provider": [],
>       >   "monmap": { "epoch": 2,
>       >       "fsid": "f0b0d4ec-1ac3-4b24-9eab-c19760ce4682",
>       >       "modified": "2013-08-23 12:55:34.374650",
>       >       "created": "0.000000",
>       >       "mons": [
>       >             { "rank": 0,
>       >               "name": "n0",
>       >               "addr": "10.0.1.0:6789\/0"},
>       >             { "rank": 1,
>       >               "name": "n12",
>       >               "addr": "10.0.1.12:6789\/0"},
>       >             { "rank": 2,
>       >               "name": "n24",
>       >               "addr": "0.0.0.0:6810\/0"}]}}
>                         ^^^^^^^^^^^^^^^^^^^^
> 
> This is the problem.  I can't remember exactly what causes this, though.
> Can you verify that the host in the ceph.conf mon_host line matches the IP
> that is configured on the machine, and that /etc/hosts on the machine
> doesn't map the node's hostname to a loopback address?
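> 
> A quick way to check both from the node itself (plain shell, nothing
> Ceph-specific; 10.0.1.24 is just the address from your mon_host line for n24):
> 
>     # is the address from mon_host actually configured on an interface?
>     ip addr show | grep '10\.0\.1\.24'
>     # which address/port did the running mon actually bind to?
>     netstat -tlnp | grep ceph-mon
>     # and confirm only localhost points at a loopback address
>     grep '^127\.' /etc/hosts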
> 
> Thanks!
> sage
> 
> 
> 
> 
> >
> > Any ideas what is going on here?  I don't see anything useful in
> > /var/log/ceph/ceph-mon.n24.log
> >
> >  Thanks,
> >
> >  - Travis
> >
> >
> 
> 
> 
> 
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
