Re: Is mon initial members used after the first quorum?

On 12/11/2014 04:28 AM, Christopher Armstrong wrote:
> If someone could point me to where this fix should go in the code, I'd
> actually love to dive in - I've been wanting to contribute back to Ceph,
> and this bug has hit us personally so I think it's a good candidate :)

I'm not sure where the bug is or what it may be (see reply to Christian's email sent a few minutes ago).

I believe the first step to assess what's happening is to reliably reproduce this, ideally in a different environment, or in a way that makes it clear it's not an issue specific to your deployment.

Next, say it's indeed the config file that is being misread: you'd probably want to look into common/config.{cc,h}. If it happens to be a bug while building a monmap, you'd want to look into mon/MonMap.cc and mon/MonClient.cc. If it turns out to be a radosgw issue, you'll probably want to look into 'rgw/*' and/or 'librados/*', but maybe someone else could give you pointers for those.

I think the main task now is to reliably reproduce this. I haven't been able to from the config you provided, but I may have made some assumptions that end up negating the whole bug.
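
In case it helps whoever tries to reproduce this, here is a sketch of the variant I'd compare against: the config posted further down the thread, with the monitors additionally listed in [global] via 'mon host' (the workaround Christian and Chris describe) and the client debug options I asked for. The addresses are copied from the posted config; putting the debug options under [client.radosgw.gateway] is only my assumption about where they are most useful, so adjust as needed. The [mon.X] sections and the remaining [global] options stay as posted.

```ini
[global]
fsid = fc0e2e09-ade3-4ff6-b23e-f789775b2515
mon initial members = nodo-3
# Workaround under test: list every monitor here, not only in [mon.X]
mon host = 192.168.2.200:6789,192.168.2.201:6789,192.168.2.202:6789

[client.radosgw.gateway]
host = deis-store-gateway
keyring = /etc/ceph/ceph.client.radosgw.keyring
log file = /dev/stdout
# Assumed placement: client-side logging for the monitor client and messenger
debug monc = 10
debug ms = 1
```

If radosgw keeps working with nodo-3 down when 'mon host' is present, but not when it is removed, that would point at the [mon.X] 'mon addr' path rather than at 'mon initial members'.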

Cheers!

  -Joao


On Wed, Dec 10, 2014 at 8:25 PM, Christopher Armstrong
<chris@xxxxxxxxxxxx> wrote:

    We're running Ceph entirely in Docker containers, so we couldn't use
    ceph-deploy due to the requirement of having a process management
    daemon (upstart, in Ubuntu's case). So, I wrote things out and
    templated them myself following the documentation.

    Thanks for linking the bug, Christian! You saved us a lot of time
    and troubleshooting. I'll post a comment on the bug.

    Chris

    On Wed, Dec 10, 2014 at 8:18 PM, Christian Balzer <chibi@xxxxxxx> wrote:

        On Wed, 10 Dec 2014 20:09:01 -0800 Christopher Armstrong wrote:

        > Christian,
        >
        > That indeed looks like the bug! We tried with moving the monitor
        > host/address into global and everything works as expected - see
        > https://github.com/deis/deis/issues/2711#issuecomment-66566318
        >
        > This seems like a potentially bad bug - how has it not come up before?

        Ah, but as you can see from the issue report, it has come up before.
        But that discussion, as well as that report, clearly fell through
        the cracks.

        It's another reason I dislike ceph-deploy: people using only it
        (probably the vast majority) will be unaffected, as it stuffs
        everything into [global].

        People reading the documentation examples or coming from older
        versions (and making changes to their config) will get bitten.

        Christian

         > Anything we can do to help with a patch?
         >
         > Chris
         >
         > On Wed, Dec 10, 2014 at 5:14 PM, Christian Balzer <chibi@xxxxxxx> wrote:
         >
         > >
         > > Hello,
         > >
         > > I think this might very well be my poor, unacknowledged bug report:
         > > http://tracker.ceph.com/issues/10012
         > >
         > > People with a mon_host entry in [global] (as created by ceph-deploy)
         > > will be fine, people with mons specified outside of [global] will not.
         > >
         > > Regards,
         > >
         > > Christian
         > >
         > > On Thu, 11 Dec 2014 00:49:03 +0000 Joao Eduardo Luis wrote:
         > >
         > > > On 12/10/2014 09:05 PM, Gregory Farnum wrote:
         > > > > What version is he running?
         > > > >
         > > > > Joao, does this make any sense to you?
         > > >
         > > >  From the MonMap code I'm pretty sure that the client should have
         > > > built the monmap from the [mon.X] sections, and solely based on
         > > > 'mon addr'.
         > > >
         > > > 'mon_initial_members' is only useful to the monitors anyway, so it
         > > > can be disregarded.
         > > >
         > > > Thus, there are two ways for a client to build a monmap:
         > > > 1) based on 'mon host' on the config (or -m on cli); or
         > > > 2) based on 'mon addr = ip1,ip2...' from the [mon.X] sections
         > > >
         > > > I don't see a 'mon host = ip1,ip2,...' on the config file, and I'm
         > > > assuming no '-m ip1,ip2...' has been supplied on the cli, so we would
         > > > have been left with the 'mon addr' options on each individual [mon.X]
         > > > section.
         > > >
         > > > We are left with two options here: assume there was unexpected
         > > > behavior on this code path -- logs or steps to reproduce would be
         > > > appreciated in this case! -- or assume something else failed:
         > > >
         > > > - are the ips on the remaining mon sections correct (nodo-1 &&
         > > > nodo-2)?
         > > > - were all the remaining monitors up and running when the failure
         > > > occurred?
         > > > - were the remaining monitors reachable by the client?
         > > >
         > > > In case you are able to reproduce this behavior, it would be nice if
         > > > you could provide logs with 'debug monc = 10' and 'debug ms = 1'.
         > > >
         > > > Cheers!
         > > >
         > > >    -Joao
         > > >
         > > >
         > > > > -Greg
         > > > >
         > > > > On Wed, Dec 10, 2014 at 11:54 AM, Christopher Armstrong
         > > > > <chris@xxxxxxxxxxxx> wrote:
         > > > >> Thanks Greg - I thought the same thing, but confirmed with the
         > > > >> user that it appears the radosgw client is indeed using initial
         > > > >> members - when he added all of his hosts to initial members,
         > > > >> things worked just fine. In either event, all of the monitors
         > > > >> were always fully enumerated later in the config file. Is this
         > > > >> potentially a bug specific to radosgw? Here's his config file:
         > > > >>
         > > > >> [global]
         > > > >> fsid = fc0e2e09-ade3-4ff6-b23e-f789775b2515
         > > > >> mon initial members = nodo-3
         > > > >> auth cluster required = cephx
         > > > >> auth service required = cephx
         > > > >> auth client required = cephx
         > > > >> osd pool default size = 3
         > > > >> osd pool default min_size = 1
         > > > >> osd pool default pg_num = 128
         > > > >> osd pool default pgp_num = 128
         > > > >> osd recovery delay start = 15
         > > > >> log file = /dev/stdout
         > > > >> mon_clock_drift_allowed = 1
         > > > >>
         > > > >>
         > > > >> [mon.nodo-1]
         > > > >> host = nodo-1
         > > > >> mon addr = 192.168.2.200:6789
         > > > >>
         > > > >> [mon.nodo-2]
         > > > >> host = nodo-2
         > > > >> mon addr = 192.168.2.201:6789
         > > > >>
         > > > >> [mon.nodo-3]
         > > > >> host = nodo-3
         > > > >> mon addr = 192.168.2.202:6789
         > > > >>
         > > > >>
         > > > >>
         > > > >> [client.radosgw.gateway]
         > > > >> host = deis-store-gateway
         > > > >> keyring = /etc/ceph/ceph.client.radosgw.keyring
         > > > >> rgw socket path = /var/run/ceph/ceph.radosgw.gateway.fastcgi.sock
         > > > >> log file = /dev/stdout
         > > > >>
         > > > >>
         > > > >> On Wed, Dec 10, 2014 at 11:40 AM, Gregory Farnum
         > > > >> <greg@xxxxxxxxxxx> wrote:
         > > > >>>
         > > > >>> On Tue, Dec 9, 2014 at 3:11 PM, Christopher Armstrong
         > > > >>> <chris@xxxxxxxxxxxx> wrote:
         > > > >>>> Hi folks,
         > > > >>>>
         > > > >>>> I think we have a bit of confusion around how initial members is
         > > > >>>> used. I understand that we can specify a single monitor (or a
         > > > >>>> subset of monitors) so that the cluster can form a quorum when it
         > > > >>>> first comes up. This is how we're using the setting now - so the
         > > > >>>> cluster can come up with just one monitor, with the other
         > > > >>>> monitors to follow later.
         > > > >>>>
         > > > >>>> However, a Deis user reported that when the monitor in his
         > > > >>>> initial members list went down, radosgw stopped functioning, even
         > > > >>>> though there are three mons in his config file. I would think
         > > > >>>> that the radosgw client would connect to any of the nodes in the
         > > > >>>> config file to get the state of the cluster, and that the initial
         > > > >>>> members list is only used when the monitors first come up and are
         > > > >>>> trying to achieve quorum.
         > > > >>>>
         > > > >>>> The issue he filed is here:
         > > > >>>> https://github.com/deis/deis/issues/2711
         > > > >>>>
         > > > >>>> He also found this Ceph issue filed:
         > > > >>>> https://github.com/ceph/ceph/pull/1233
         > > > >>>
         > > > >>> Nope, this has nothing to do with it.
         > > > >>>
         > > > >>>>
         > > > >>>> Is that what we're seeing here? Can anyone point us in the right
         > > > >>>> direction?
         > > > >>>
         > > > >>> I didn't see the actual conf file posted anywhere to look at,
         > > > >>> but my guess is simply that (since it looks like you're using
         > > > >>> generated conf files which can differ across hosts) the one
         > > > >>> on the server(s) in question doesn't have the monitors listed in
         > > > >>> it. I'm only skimming the code, but from it and my
         > > > >>> recollection, when a Ceph client starts up it will try to
         > > > >>> assemble a list of monitors to contact from:
         > > > >>> 1) the contents of the "mon host" config entry
         > > > >>> 2) the "mon addr" value in any of the "global", "mon" or "mon.X"
         > > > >>> sections
         > > > >>>
         > > > >>> The clients don't even look at mon_initial_members that I can
         > > > >>> see, actually - so perhaps your client config only lists the
         > > > >>> initial monitor, without adding the others?
         > > > >>> -Greg
         > > > >>> -Greg
         > > > >>
         > > > >>
         > > > > _______________________________________________
         > > > > ceph-users mailing list
         > > > > ceph-users@xxxxxxxxxxxxxx
         > > > > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
         > > > >
         > > >
         > > >
         > >
         > >
         > > --
         > > Christian Balzer        Network/Systems Engineer
         > > chibi@xxxxxxx           Global OnLine Japan/Fusion Communications
         > > http://www.gol.com/
         > >


        --
        Christian Balzer        Network/Systems Engineer
        chibi@xxxxxxx           Global OnLine Japan/Fusion Communications
        http://www.gol.com/





--
Joao Eduardo Luis
Software Engineer | http://ceph.com
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com