Cool. So far I have tried:
start on (local-filesystems and net-device-up IFACE=eth0)
start on (local-filesystems and net-device-up IFACE=eth0 and net-device-up IFACE=eth1)
About to try:
start on (local-filesystems and net-device-up IFACE=eth0 and net-device-up IFACE=eth1 and started network-services)
The "local-filesystems" + network device is billed as an alternative to runlevel if you need to to do something *after* networking...start on (local-filesystems and net-device-up IFACE=eth0)
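If one of these works out, it is probably worth putting it in an Upstart override file rather than editing the packaged job, so the change survives package upgrades. A minimal sketch, assuming the job being gated is the packaged ceph-all job (substitute whichever ceph job you actually want to hold back):

    # /etc/init/ceph-all.override
    # Only the "start on" stanza is replaced; everything else in the
    # packaged /etc/init/ceph-all.conf job definition still applies.
    start on (local-filesystems and net-device-up IFACE=eth0 and net-device-up IFACE=eth1)
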
On Mon, Aug 26, 2013 at 2:31 PM, Sage Weil <sage@xxxxxxxxxxx> wrote:
On Mon, 26 Aug 2013, Travis Rhoden wrote:
> Hi Sage,
>
> Thanks for the response. I noticed that as well, and suspected
> hostname/DHCP/DNS shenanigans. What's weird is that all nodes are
> identically configured. I also have monitors running on n0 and n12, and
> they come up fine, every time.
>
> Here's the mon_host line from ceph.conf:
>
> mon_initial_members = n0, n12, n24
> mon_host = 10.0.1.0,10.0.1.12,10.0.1.24
>
> just to test /etc/hosts and name resolution...
>
> root@n24:~# getent hosts n24
> 10.0.1.24 n24
> root@n24:~# hostname -s
> n24
>
> The only loopback device in /etc/hosts is "127.0.0.1 localhost", so
> that should be fine.
>
> Upon rebooting this node, I've had the monitor come up okay once, maybe out
> of 12 tries. So it appears to be some kind of race... No clue what is
> going on. If I stop and start the monitor (or restart), it doesn't appear
> to change anything.
>
> However, on the topic of races, I am having one other, more pressing issue.
> Each OSD host has its hostname assigned via DHCP.  Until that
> assignment is made (during init), the hostname is "localhost", and then it
> switches over to "n<x>", for some node number. The issue I am seeing is
> that there is a race between this hostname assignment and the Ceph Upstart
> scripts, such that sometimes ceph-osd starts while the hostname is still
> 'localhost'. This then causes the osd location to change in the crushmap,
> which is going to be a very bad thing. =) When rebooting all my nodes at
> once (there are several dozen), about 50% move from being under n<x> to
> localhost. Restarting all the ceph-osd jobs moves them back (because the
> hostname is defined).
>
> I'm wondering what kind of delay, or additional "start-on" logic I can add
> to the upstart script to work around this.

Hmm, this is beyond my upstart-fu, unfortunately.  This has come up
before, actually.  Previously we would wait for any interface to come up
and then start, but that broke with multi-nic machines, and I ended up
just making things start in runlevel [2345].
James, do you know what should be done to make the job wait for *all*
network interfaces to be up? Is that even the right solution here?
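
One (untested) idea in the meantime: a small one-shot task that just blocks
until DHCP has replaced the placeholder hostname, which the ceph jobs could
then be gated on. This is only a sketch -- the wait-for-hostname name, the
eth0 trigger, and the 60-second cap are all made up for illustration:

    # /etc/init/wait-for-hostname.conf   (illustrative name)
    description "block until the DHCP-assigned hostname has replaced 'localhost'"

    start on (local-filesystems and net-device-up IFACE=eth0)
    task

    script
        # Poll for up to 60 seconds, then give up rather than hang the boot.
        i=0
        while [ "$(hostname -s)" = "localhost" ] && [ "$i" -lt 60 ]; do
            sleep 1
            i=$((i + 1))
        done
    end script

Since it is a task, its "stopped wait-for-hostname" event fires once the
script exits, so a condition like

    start on (local-filesystems and net-device-up IFACE=eth0 and net-device-up IFACE=eth1 and stopped wait-for-hostname)

would keep the gated job from starting until the hostname has settled (or the
timeout expires).
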
sage
>
>
> On Fri, Aug 23, 2013 at 4:47 PM, Sage Weil <sage@xxxxxxxxxxx> wrote:
> Hi Travis,
>
> On Fri, 23 Aug 2013, Travis Rhoden wrote:
> > Hey folks,
> >
> > I've just done a brand new install of 0.67.2 on a cluster of Calxeda nodes.
> >
> > I have one particular monitor that never joins the quorum when I restart
> > the node.  Looks to me like it has something to do with the "create-keys"
> > task, which never seems to finish:
> >
> > root      1240     1  4 13:03 ?        00:00:02 /usr/bin/ceph-mon --cluster=ceph -i n24 -f
> > root      1244     1  0 13:03 ?        00:00:00 /usr/bin/python /usr/sbin/ceph-create-keys --cluster=ceph -i n24
> >
> > I don't see that task on my other monitors.  Additionally, that task is
> > periodically querying the monitor status:
> >
> > root      1240     1  2 13:03 ?        00:00:02 /usr/bin/ceph-mon --cluster=ceph -i n24 -f
> > root      1244     1  0 13:03 ?        00:00:00 /usr/bin/python /usr/sbin/ceph-create-keys --cluster=ceph -i n24
> > root      1982  1244 15 13:04 ?        00:00:00 /usr/bin/python /usr/bin/ceph --cluster=ceph --admin-daemon=/var/run/ceph/ceph-mon.n24.asok mon_status
> >
> > Checking that status myself, I see:
> >
> > # ceph --cluster=ceph --admin-daemon=/var/run/ceph/ceph-mon.n24.asok mon_status
> > { "name": "n24",
> > "rank": 2,
> > "state": "probing",
> > "election_epoch": 0,
> > "quorum": [],
> > "outside_quorum": [
> > "n24"],
> > "extra_probe_peers": [],
> > "sync_provider": [],
> > "monmap": { "epoch": 2,
> > "fsid": "f0b0d4ec-1ac3-4b24-9eab-c19760ce4682",
> > "modified": "2013-08-23 12:55:34.374650",
> > "created": "0.000000",
> > "mons": [
> > { "rank": 0,
> > "name": "n0",
> > "addr": "10.0.1.0:6789\/0"},
> > { "rank": 1,
> > "name": "n12",
> > "addr": "10.0.1.12:6789\/0"},
> > { "rank": 2,
> > "name": "n24",
> > "addr": "0.0.0.0:6810\/0"}]}}
> ^^^^^^^^^^^^^^^^^^^^
>
> This is the problem.  I can't remember exactly what causes this, though.
> Can you verify that the host in the ceph.conf mon_host line matches the IP
> that is configured on the machine, and that /etc/hosts on the machine
> doesn't have a loopback address on it?
>
> Thanks!
> sage
>
>
>
>
> >
> > Any ideas what is going on here? I don't see anything useful in
> > /var/log/ceph/ceph-mon.n24.log
> >
> > Thanks,
> >
> > - Travis
> >
> >
>
>
>
>