Re: Is mon initial members used after the first quorum?

If someone could point me to where this fix should go in the code, I'd actually love to dive in - I've been wanting to contribute back to Ceph, and this bug has hit us personally so I think it's a good candidate :)

On Wed, Dec 10, 2014 at 8:25 PM, Christopher Armstrong <chris@xxxxxxxxxxxx> wrote:
We're running Ceph entirely in Docker containers, so we couldn't use ceph-deploy due to the requirement of having a process management daemon (upstart, in Ubuntu's case). So, I wrote things out and templated them myself following the documentation.

Thanks for linking the bug, Christian! You saved us a lot of time and troubleshooting. I'll post a comment on the bug.

Chris
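
For anyone else templating ceph.conf by hand, here is a minimal sketch of a generator that lists every monitor in [global] via "mon host", which is the workaround confirmed in this thread. The function name render_ceph_conf and the (name, host, addr) tuple layout are assumptions made for illustration; they are not part of Ceph or Deis.

```python
def render_ceph_conf(fsid, monitors):
    """Render a minimal ceph.conf from (name, host, addr) monitor tuples."""
    # List every monitor address in [global] so that clients can find
    # them without depending on the per-monitor sections.
    mon_host = ",".join(addr for _name, _host, addr in monitors)
    lines = ["[global]", f"fsid = {fsid}", f"mon host = {mon_host}", ""]
    # Keep the per-monitor sections as well, mirroring the config in this thread.
    for name, host, addr in monitors:
        lines += [f"[mon.{name}]", f"host = {host}", f"mon addr = {addr}", ""]
    return "\n".join(lines)


# Monitor names and addresses taken from the config quoted in this thread.
MONITORS = [
    ("nodo-1", "nodo-1", "192.168.2.200:6789"),
    ("nodo-2", "nodo-2", "192.168.2.201:6789"),
    ("nodo-3", "nodo-3", "192.168.2.202:6789"),
]

if __name__ == "__main__":
    print(render_ceph_conf("fc0e2e09-ade3-4ff6-b23e-f789775b2515", MONITORS))
```

A real template would also carry the auth, pool and client sections; they are left out here to keep the monitor handling visible.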

On Wed, Dec 10, 2014 at 8:18 PM, Christian Balzer <chibi@xxxxxxx> wrote:
On Wed, 10 Dec 2014 20:09:01 -0800 Christopher Armstrong wrote:

> Christian,
>
> That indeed looks like the bug! We tried with moving the monitor
> host/address into global and everything works as expected - see
> https://github.com/deis/deis/issues/2711#issuecomment-66566318
>
> This seems like a potentially bad bug - how has it not come up before?

Ah, but as you can see from the issue report, it has come up before.
But that discussion as well as that report clearly fell through the cracks.

It's another reason I dislike ceph-deploy, as people who rely on it alone
(probably the vast majority) will be unaffected, since it stuffs everything
into [global].

People reading the documentation examples or coming from older versions
(and making changes to their config) will get bitten.

Christian
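
To check whether a hand-written config is exposed to this, a rough sketch along the following lines may help. It only inspects the file: it reports whether [global] carries a "mon host" entry and collects the "mon addr" values from the [mon.X] sections. The helper names (audit_monitors, suggest_mon_host) are made up for this example, and Python's configparser is only an approximation of Ceph's own config parser.

```python
import configparser


def _options(parser, section):
    # Ceph accepts "mon host" and "mon_host" interchangeably, so normalise.
    return {key.replace(" ", "_"): value for key, value in parser.items(section)}


def audit_monitors(conf_text):
    """Return (mon_host value from [global] or None, {section: mon addr})."""
    parser = configparser.ConfigParser(delimiters=("=",), interpolation=None)
    parser.read_string(conf_text)
    global_opts = _options(parser, "global") if parser.has_section("global") else {}
    mon_addrs = {}
    for section in parser.sections():
        if section.startswith("mon."):
            addr = _options(parser, section).get("mon_addr")
            if addr:
                mon_addrs[section] = addr
    return global_opts.get("mon_host"), mon_addrs


def suggest_mon_host(mon_addrs):
    """Build a candidate 'mon host' value from the per-monitor addresses."""
    return ",".join(mon_addrs[name] for name in sorted(mon_addrs))


# Abridged from the config quoted further down in this thread.
SAMPLE_CONF = """\
[global]
fsid = fc0e2e09-ade3-4ff6-b23e-f789775b2515
mon initial members = nodo-3

[mon.nodo-1]
host = nodo-1
mon addr = 192.168.2.200:6789

[mon.nodo-2]
host = nodo-2
mon addr = 192.168.2.201:6789

[mon.nodo-3]
host = nodo-3
mon addr = 192.168.2.202:6789
"""

if __name__ == "__main__":
    mon_host, mon_addrs = audit_monitors(SAMPLE_CONF)
    if mon_host is None and mon_addrs:
        print("no 'mon host' in [global]; consider adding:")
        print("mon host = " + suggest_mon_host(mon_addrs))
```

A config that already has "mon host" in [global] makes audit_monitors return that value as the first element, so the warning is skipped.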

> Anything we can do to help with a patch?
>
> Chris
>
> On Wed, Dec 10, 2014 at 5:14 PM, Christian Balzer <chibi@xxxxxxx> wrote:
>
> >
> > Hello,
> >
> > I think this might very well be my poor, unacknowledged bug report:
> > http://tracker.ceph.com/issues/10012
> >
> > People with a mon_host entry in [global] (as created by ceph-deploy)
> > will be fine, people with mons specified outside of [global] will not.
> >
> > Regards,
> >
> > Christian
> >
> > On Thu, 11 Dec 2014 00:49:03 +0000 Joao Eduardo Luis wrote:
> >
> > > On 12/10/2014 09:05 PM, Gregory Farnum wrote:
> > > > What version is he running?
> > > >
> > > > Joao, does this make any sense to you?
> > >
> > >  From the MonMap code I'm pretty sure that the client should have
> > > built the monmap from the [mon.X] sections, and solely based on 'mon
> > > addr'.
> > >
> > > 'mon_initial_members' is only useful to the monitors anyway, so it
> > > can be disregarded.
> > >
> > > Thus, there are two ways for a client to build a monmap:
> > > > 1) based on 'mon host' on the config (or -m on cli); or
> > > 2) based on 'mon addr = ip1,ip2...' from the [mon.X] sections
> > >
> > > > I don't see a 'mon host = ip1,ip2,...' on the config file, and I'm
> > > > assuming no '-m ip1,ip2...' has been supplied on the cli, so we would
> > > have been left with the 'mon addr' options on each individual [mon.X]
> > > section.
> > >
> > > We are left with two options here: assume there was unexpected
> > > behavior on this code path -- logs or steps to reproduce would be
> > > appreciated in this case! -- or assume something else failed:
> > >
> > > - are the ips on the remaining mon sections correct (nodo-1 &&
> > > nodo-2)?
> > > - were all the remaining monitors up and running when the failure
> > > occurred?
> > > - were the remaining monitors reachable by the client?
> > >
> > > > In case you are able to reproduce this behavior, it would be nice if you
> > > could provide logs with 'debug monc = 10' and 'debug ms = 1'.
> > >
> > > Cheers!
> > >
> > >    -Joao
> > >
> > >
> > > > -Greg
> > > >
> > > > On Wed, Dec 10, 2014 at 11:54 AM, Christopher Armstrong
> > > > <chris@xxxxxxxxxxxx> wrote:
> > > >> Thanks Greg - I thought the same thing, but confirmed with the
> > > >> user that it appears the radosgw client is indeed using initial
> > > >> members - when he added all of his hosts to initial members,
> > > >> things worked just fine. In either event, all of the monitors
> > > >> were always fully enumerated later in the config file. Is this
> > > >> potentially a bug specific to radosgw? Here's his config file:
> > > >>
> > > >> [global]
> > > >> fsid = fc0e2e09-ade3-4ff6-b23e-f789775b2515
> > > >> mon initial members = nodo-3
> > > >> auth cluster required = cephx
> > > >> auth service required = cephx
> > > >> auth client required = cephx
> > > >> osd pool default size = 3
> > > >> osd pool default min_size = 1
> > > >> osd pool default pg_num = 128
> > > >> osd pool default pgp_num = 128
> > > >> osd recovery delay start = 15
> > > >> log file = /dev/stdout
> > > >> mon_clock_drift_allowed = 1
> > > >>
> > > >>
> > > >> [mon.nodo-1]
> > > >> host = nodo-1
> > > >> mon addr = 192.168.2.200:6789
> > > >>
> > > >> [mon.nodo-2]
> > > >> host = nodo-2
> > > >> mon addr = 192.168.2.201:6789
> > > >>
> > > >> [mon.nodo-3]
> > > >> host = nodo-3
> > > >> mon addr = 192.168.2.202:6789
> > > >>
> > > >>
> > > >>
> > > >> [client.radosgw.gateway]
> > > >> host = deis-store-gateway
> > > >> keyring = /etc/ceph/ceph.client.radosgw.keyring
> > > >> rgw socket path = /var/run/ceph/ceph.radosgw.gateway.fastcgi.sock
> > > >> log file = /dev/stdout
> > > >>
> > > >>
> > > >> On Wed, Dec 10, 2014 at 11:40 AM, Gregory Farnum
> > > >> <greg@xxxxxxxxxxx> wrote:
> > > >>>
> > > >>> On Tue, Dec 9, 2014 at 3:11 PM, Christopher Armstrong
> > > >>> <chris@xxxxxxxxxxxx> wrote:
> > > >>>> Hi folks,
> > > >>>>
> > > > >>>> I think we have a bit of confusion around how initial members is
> > > > >>>> used. I understand that we can specify a single monitor (or a
> > > > >>>> subset of monitors) so that the cluster can form a quorum when it
> > > > >>>> first comes up. This is how we're using the setting now - so the
> > > > >>>> cluster can come up with just one monitor, with the other monitors
> > > > >>>> to follow later.
> > > > >>>>
> > > > >>>> However, a Deis user reported that when the monitor in his initial
> > > > >>>> members list went down, radosgw stopped functioning, even though
> > > > >>>> there are three mons in his config file. I would think that the
> > > > >>>> radosgw client would connect to any of the nodes in the config file
> > > > >>>> to get the state of the cluster, and that the initial members list
> > > > >>>> is only used when the monitors first come up and are trying to
> > > > >>>> achieve quorum.
> > > >>>>
> > > > >>>> The issue he filed is here:
> > > > >>>> https://github.com/deis/deis/issues/2711
> > > >>>>
> > > >>>> He also found this Ceph issue filed:
> > > >>>> https://github.com/ceph/ceph/pull/1233
> > > >>>
> > > >>> Nope, this has nothing to do with it.
> > > >>>
> > > >>>>
> > > >>>> Is that what we're seeing here? Can anyone point us in the right
> > > >>>> direction?
> > > >>>
> > > > >>> I didn't see the actual conf file posted anywhere to look at,
> > > > >>> but my guess is simply that (since it looks like you're using
> > > > >>> generated conf files, which can differ across hosts) the ones
> > > > >>> on the server(s) in question don't have the monitors listed in
> > > > >>> them. I'm only skimming the code, but from it and my
> > > > >>> recollection, when a Ceph client starts up it will try to
> > > > >>> assemble a list of monitors to contact from:
> > > > >>> 1) the contents of the "mon host" config entry
> > > > >>> 2) the "mon addr" value in any of the "global", "mon" or
> > > > >>> "mon.X" sections
> > > >>>
> > > >>> The clients don't even look at mon_initial_members that I can
> > > >>> see, actually — so perhaps your client config only lists the
> > > >>> initial monitor, without adding the others?
> > > >>> -Greg
> > > >>
> > > >>
> > > > _______________________________________________
> > > > ceph-users mailing list
> > > > ceph-users@xxxxxxxxxxxxxx
> > > > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> > > >
> > >
> > >
> >
> >
> > --
> > Christian Balzer        Network/Systems Engineer
> > chibi@xxxxxxx           Global OnLine Japan/Fusion Communications
> > http://www.gol.com/
> >


--
Christian Balzer        Network/Systems Engineer
chibi@xxxxxxx           Global OnLine Japan/Fusion Communications
http://www.gol.com/


