Re: ceph-create-keys hung

On 08/10/13 10:58, Abhay Sachan wrote:
Hi Joao,
When I run the following command on the node "ceph --admin-daemon
/var/run/ceph/ceph-mon.dec.asok mon_status"
I get the following output:

{ "name": "dec",
   "rank": 0,
   "state": "probing",
   "election_epoch": 0,
   "quorum": [],
   "outside_quorum": [
         "dec"],
   "extra_probe_peers": [
         "15.213.24.230:6789 <http://15.213.24.230:6789>\/0",
         "15.213.24.231:6789 <http://15.213.24.231:6789>\/0"],
   "sync_provider": [],
   "monmap": { "epoch": 0,
       "fsid": "166ba014-9b5b-4cea-8403-ecdb6e0a1763",
       "modified": "0.000000",
       "created": "0.000000",
       "mons": [
             { "rank": 0,
               "name": "dec",
               "addr": "15.213.24.241:6789 <http://15.213.24.241:6789>\/0"},
             { "rank": 1,
               "name": "jul",
               "addr": "0.0.0.0:0 <http://0.0.0.0:0>\/1"},
             { "rank": 2,
               "name": "julilo",
               "addr": "0.0.0.0:0 <http://0.0.0.0:0>\/2"}]}
}

And the addr for the other two nodes is "0.0.0.0" in the monitor list.
Any ideas on this?

Abhay,

Your monitors are not connecting to each other and thus are not forming a quorum. That's why your monmap looks so sketchy: the monitors haven't found each other, so they haven't filled it in. This can be due to a number of issues, but my guess would be those weird addresses defined in your monmap, which don't look much like valid ip:port addresses; it looks like you have 'http://foo' in there.
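To make the reading of that output concrete, here is a minimal sketch that summarizes a 'mon_status' dump. The function name and the returned keys are illustrative only, not part of any Ceph tool, and the sample is a trimmed copy of the output quoted above with the '<http://...>' link text left out. It assumes that an address still showing as 0.0.0.0:0 means the monitor has not been located by its peers.

```python
import json

def summarize_mon_status(raw: str) -> dict:
    """Summarize the JSON printed by a monitor's `mon_status` admin-socket command."""
    status = json.loads(raw)
    mons = status.get("monmap", {}).get("mons", [])
    # Assumption: an address of 0.0.0.0:0 is a placeholder for a monitor
    # that this node has not managed to locate yet.
    unresolved = [m["name"] for m in mons if m.get("addr", "").startswith("0.0.0.0:0")]
    return {
        "name": status.get("name"),
        "state": status.get("state"),
        # A monitor is in quorum only if its own rank is listed in "quorum".
        "in_quorum": status.get("rank") in status.get("quorum", []),
        "unresolved": unresolved,
    }

# Trimmed copy of the output from this thread (raw string keeps the JSON "\/" escapes).
SAMPLE = r'''
{ "name": "dec", "rank": 0, "state": "probing", "election_epoch": 0,
  "quorum": [], "outside_quorum": ["dec"],
  "monmap": { "epoch": 0, "mons": [
      { "rank": 0, "name": "dec",    "addr": "15.213.24.241:6789\/0"},
      { "rank": 1, "name": "jul",    "addr": "0.0.0.0:0\/1"},
      { "rank": 2, "name": "julilo", "addr": "0.0.0.0:0\/2"}]}}
'''

if __name__ == "__main__":
    print(summarize_mon_status(SAMPLE))
```

For the sample, the summary reports the monitor as probing, not in quorum, with jul and julilo still unresolved.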

Anyway, the reason why nothing appears to work is that your monitors are unable to speak to each other. Try cranking up debug levels on the monitors (I'd suggest 'debug mon = 10', 'debug ms = 1' and 'debug auth = 10') and look for connectivity or keyring issues.
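Before digging through debug logs, a quick reachability test from each node can rule out plain network problems. The sketch below is only an illustration: 'can_connect' is a made-up helper, and a successful TCP connect to port 6789 shows that something is listening there, not that cephx authentication will work.

```python
import socket

def can_connect(host: str, port: int = 6789, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable networks.
        return False
```

Running can_connect("15.213.24.230") and can_connect("15.213.24.231") on dec, and the equivalent on the other two nodes, would show whether the monitors can reach each other at all. If they cannot, a firewall rule blocking port 6789 is one possible cause worth checking.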

I'd also advise you to generate a new monmap and inject it on all three monitors, or redeploy the monitors following the docs.
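As a rough sketch of the first option, the snippet below only builds and prints the 'monmaptool' and 'ceph-mon --inject-monmap' command lines; it does not run anything. The helper name and the /tmp/monmap path are placeholders, and the pairing of jul and julilo with the .230 and .231 addresses is an assumption, since the output in this thread does not say which address belongs to which monitor. Each monitor should be stopped before its inject command is run on its own host.

```python
import shlex

def monmap_commands(fsid, monitors, path="/tmp/monmap"):
    """Build (but do not run) the commands to create a monmap and inject it."""
    create = ["monmaptool", "--create", "--clobber", "--fsid", fsid]
    for name, addr in monitors:
        create += ["--add", name, addr]
    create.append(path)
    # One inject command per monitor, to be run on that monitor's host
    # while the monitor daemon is stopped.
    inject = [["ceph-mon", "-i", name, "--inject-monmap", path] for name, _ in monitors]
    return [create] + inject

FSID = "166ba014-9b5b-4cea-8403-ecdb6e0a1763"  # fsid from the mon_status output
MONITORS = [
    ("dec", "15.213.24.241:6789"),
    ("jul", "15.213.24.230:6789"),     # assumed name/address pairing
    ("julilo", "15.213.24.231:6789"),  # assumed name/address pairing
]

if __name__ == "__main__":
    for cmd in monmap_commands(FSID, MONITORS):
        print(" ".join(shlex.quote(part) for part in cmd))
```

Printing the commands first, rather than executing them, leaves room to correct the names and addresses before touching the monitor stores.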

  -Joao



On Tue, Oct 8, 2013 at 3:15 PM, Abhay Sachan <abhaysac@xxxxxxxxx> wrote:

    Hi Joao,
    I gave the path for the keyring, and these messages are being
    printed on the screen:

    2013-10-07 15:18:55.048151 7fa1c43bc700  0 -- :/1026989 >> 15.213.24.231:6789/0 pipe(0x7fa1b4000990 sd=4 :0 s=1 pgs=0 cs=0 l=1 c=0x7fa1b400dfe0).fault
    2013-10-07 15:18:58.048774 7fa1c42bb700  0 -- :/1026989 >> 15.213.24.230:6789/0 pipe(0x7fa1b4001cb0 sd=3 :0 s=1 pgs=0 cs=0 l=1 c=0x7fa1b4002290).fault
    2013-10-07 15:19:01.049193 7fa1c43bc700  0 -- :/1026989 >> 15.213.24.231:6789/0 pipe(0x7fa1b4000990 sd=4 :0 s=1 pgs=0 cs=0 l=1 c=0x7fa1b40038b0).fault
    2013-10-07 15:19:04.049690 7fa1c42bb700  0 -- :/1026989 >> 15.213.24.241:6789/0 pipe(0x7fa1b4001cb0 sd=4 :0 s=1 pgs=0 cs=0 l=1 c=0x7fa1b4002590).fault
    2013-10-07 15:19:07.050098 7fa1c43bc700  0 -- :/1026989 >> 15.213.24.231:6789/0 pipe(0x7fa1b4000990 sd=4 :0 s=1 pgs=0 cs=0 l=1 c=0x7fa1b40038b0).fault
    2013-10-07 15:19:10.050467 7fa1c42bb700  0 -- :/1026989 >> 15.213.24.230:6789/0 pipe(0x7fa1b4001cb0 sd=4 :0 s=1 pgs=0 cs=0 l=1 c=0x7fa1b4005710).fault
    2013-10-07 15:19:13.051033 7fa1c43bc700  0 -- :/1026989 >> 15.213.24.241:6789/0 pipe(0x7fa1b4000990 sd=3 :0 s=1 pgs=0 cs=0 l=1 c=0x7fa1b4005870).fault
    2013-10-07 15:19:16.051257 7fa1c42bb700  0 -- :/1026989 >> 15.213.24.231:6789/0 pipe(0x7fa1b4001cb0 sd=4 :0 s=1 pgs=0 cs=0 l=1 c=0x7fa1b40042e0).fault
    2013-10-07 15:19:19.051823 7fa1c43bc700  0 -- :/1026989 >> 15.213.24.241:6789/0 pipe(0x7fa1b4000990 sd=3 :0 s=1 pgs=0 cs=0 l=1 c=0x7fa1b4004990).fault
    2013-10-07 15:19:22.052177 7fa1c42bb700  0 -- :/1026989 >> 15.213.24.230:6789/0 pipe(0x7fa1b4001cb0 sd=4 :0 s=1 pgs=0 cs=0 l=1 c=0x7fa1b40042e0).fault
    2013-10-07 15:19:25.052526 7fa1c43bc700  0 -- :/1026989 >> 15.213.24.241:6789/0 pipe(0x7fa1b4000990 sd=4 :0 s=1 pgs=0 cs=0 l=1 c=0x7fa1b400f4a0).fault
    2013-10-07 15:19:28.053000 7fa1c42bb700  0 -- :/1026989 >> 15.213.24.231:6789/0 pipe(0x7fa1b4007d90 sd=4 :0 s=1 pgs=0 cs=0 l=1 c=0x7fa1b4007250).fault
    2013-10-07 15:19:31.053452 7fa1c43bc700  0 -- :/1026989 >> 15.213.24.230:6789/0 pipe(0x7fa1b4000990 sd=4 :0 s=1 pgs=0 cs=0 l=1 c=0x7fa1b400f4a0).fault
    2013-10-07 15:19:34.054070 7fa1c42bb700  0 -- :/1026989 >> 15.213.24.241:6789/0 pipe(0x7fa1b4001cb0 sd=3 :0 s=1 pgs=0 cs=0 l=1 c=0x7fa1b4007d90).fault
    2013-10-07 15:19:37.054303 7fa1c43bc700  0 -- :/1026989 >> 15.213.24.230:6789/0 pipe(0x7fa1b4000990 sd=4 :0 s=1 pgs=0 cs=0 l=1 c=0x7fa1b400a7e0).fault
    2013-10-07 15:19:40.054724 7fa1c42bb700  0 -- :/1026989 >> 15.213.24.241:6789/0 pipe(0x7fa1b4001cb0 sd=4 :0 s=1 pgs=0 cs=0 l=1 c=0x7fa1b400a1b0).fault
    2013-10-07 15:19:43.055099 7fa1c43bc700  0 -- :/1026989 >> 15.213.24.231:6789/0 pipe(0x7fa1b4000990 sd=4 :0 s=1 pgs=0 cs=0 l=1 c=0x7fa1b4009610).fault
    2013-10-07 15:19:46.055462 7fa1c42bb700  0 -- :/1026989 >> 15.213.24.230:6789/0 pipe(0x7fa1b4001cb0 sd=3 :0 s=1 pgs=0 cs=0 l=1 c=0x7fa1b4007d90).fault
    2013-10-07 15:19:49.056004 7fa1c43bc700  0 -- :/1026989 >> 15.213.24.231:6789/0 pipe(0x7fa1b4000990 sd=4 :0 s=1 pgs=0 cs=0 l=1 c=0x7fa1b400c5a0).fault
    2013-10-07 15:19:52.056492 7fa1c42bb700  0 -- :/1026989 >> 15.213.24.230:6789/0 pipe(0x7fa1b4001cb0 sd=4 :0 s=1 pgs=0 cs=0 l=1 c=0x7fa1b400b000).fault
    2013-10-07 15:19:55.056740 7fa1c43bc700  0 -- :/1026989 >> 15.213.24.231:6789/0 pipe(0x7fa1b4000990 sd=4 :0 s=1 pgs=0 cs=0 l=1 c=0x7fa1b40111b0).fault
    2013-10-07 15:19:58.057117 7fa1c42bb700  0 -- :/1026989 >> 15.213.24.241:6789/0 pipe(0x7fa1b4009110 sd=3 :0 s=1 pgs=0 cs=0 l=1 c=0x7fa1b400b4a0).fault
    2013-10-07 15:20:01.057596 7fa1c43bc700  0 -- :/1026989 >> 15.213.24.230:6789/0 pipe(0x7fa1b400bc00 sd=4 :0 s=1 pgs=0 cs=0 l=1 c=0x7fa1b400be60).fault
    2013-10-07 15:20:04.058191 7fa1c42bb700  0 -- :/1026989 >> 15.213.24.231:6789/0 pipe(0x7fa1b400d170 sd=4 :0 s=1 pgs=0 cs=0 l=1 c=0x7fa1b4015300).fault
    2013-10-07 15:20:07.058837 7fa1c43bc700  0 -- :/1026989 >> 15.213.24.230:6789/0 pipe(0x7fa1b400f640 sd=3 :0 s=1 pgs=0 cs=0 l=1 c=0x7fa1b400bc00).fault
    2013-10-07 15:20:10.059147 7fa1c42bb700  0 -- :/1026989 >> 15.213.24.231:6789/0 pipe(0x7fa1b400d170 sd=4 :0 s=1 pgs=0 cs=0 l=1 c=0x7fa1b40015c0).fault


    Regards,
    Abhay




    On Tue, Oct 8, 2013 at 2:44 PM, Joao Eduardo Luis
    <joao.luis@xxxxxxxxxxx> wrote:

        On 08/10/13 05:40, Abhay Sachan wrote:

            Hi Joao,
            Thanks for replying. All of my monitors are up and running
            and connected to each other. "ceph -s" is failing on the
            cluster with the following error:

            2013-10-07 10:12:25.099261 7fd1b948d700 -1 monclient(hunting): ERROR: missing keyring, cannot use cephx for authentication
            2013-10-07 10:12:25.099271 7fd1b948d700  0 librados: client.admin initialization error (2) No such file or directory
            Error connecting to cluster: ObjectNotFound


        This says it all.  You're somehow missing your keyring.  Or
        maybe you're not but you're keeping it in a non-default location
        and you're not specifying a path to it either on your ceph.conf
        or via '--keyring <path>'.



            And the logs on each monitor have lots of entries like this:
            NODE 1:

            2013-10-07 03:58:51.153847 7ff2864c6700  0 mon.jul@0(probing).data_health(0) update_stats avail 76% total 42332700 used 7901820 avail 32280480
            2013-10-07 03:59:51.154051 7ff2864c6700  0 mon.jul@0(probing).data_health(0) update_stats avail 76% total 42332700 used 7901832 avail 32280468
            2013-10-07 04:00:51.154256 7ff2864c6700  0 mon.jul@0(probing).data_health(0) update_stats avail 76% total 42332700 used 7901828 avail 32280472


        Those messages are simply reports on the mon store's storage
        capacity.


           -Joao



            Regards,
            Abhay

            On Thu, Oct 3, 2013 at 8:31 PM, Joao Eduardo Luis
            <joao.luis@xxxxxxxxxxx> wrote:

                 On 10/03/2013 02:44 PM, Abhay Sachan wrote:

                     Hi All,
                     I have tried setting up a Ceph cluster with 3 nodes
                     (3 monitors). I am using RHEL 6.4 as the OS with the
                     dumpling (0.67.3) release. During Ceph cluster
                     creation (using ceph-deploy as well as mkcephfs),
                     ceph-create-keys doesn't return on any of the
                     servers. Whereas, if I create a cluster with only
                     1 node (1 monitor), key creation goes through. Has
                     anybody seen this problem, or any ideas what I might
                     be missing?

                     Regards,
                     Abhay


                 Those symptoms tell me that your monitors are not
                 forming quorum. 'ceph-create-keys' needs the monitors
                 to first establish a quorum, otherwise it will hang
                 waiting for that to happen.

                 Please make sure all your monitors are running.  If so,
                 try running 'ceph -s' on your cluster.  If that hangs
                 as well, try accessing each monitor's admin socket to
                 check what's happening [1].  If that too fails, try
                 looking into the logs for something obviously wrong.
                 If you are not able to discern anything useful at that
                 point, upload the logs to some place and point us to
                 them -- we'll then be happy to take a look.

                 Hope this helps.

                    -Joao

                 --
                 Joao Eduardo Luis
                 Software Engineer | http://inktank.com | http://ceph.com




        --
        Joao Eduardo Luis
        Software Engineer | http://inktank.com | http://ceph.com





--
Joao Eduardo Luis
Software Engineer | http://inktank.com | http://ceph.com
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



