Re: ceph-create-keys hung

Hi Abhay

I had a similar mon<->mon communication problem using ceph-deploy, which turned out to be down to iptables rules.  Depending on what OS you are running, the ports Ceph uses may be blocked by default.  As per http://ceph.com/docs/master/rados/configuration/network-config-ref/ you need to open port 6789 (monitors) and ports 6800-7100 (OSD daemons).
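
If it helps, here is a rough way to check from each node whether the monitor port on the other nodes is reachable at all. It is only a sketch in Python: the function names are made up for illustration, and a successful TCP connect only shows that nothing is dropping the connection, not that the monitor behind it is healthy.

```python
import socket

MON_PORT = 6789  # default monitor port, per the network config reference


def port_reachable(host, port=MON_PORT, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts alike.
        return False


def unreachable_monitors(hosts, port=MON_PORT, timeout=2.0):
    """Return the hosts (hypothetical helper) whose monitor port cannot be reached."""
    return [h for h in hosts if not port_reachable(h, port, timeout)]
```

Running unreachable_monitors(["15.213.24.230", "15.213.24.231", "15.213.24.241"]) on each of the three nodes in turn should come back with an empty list once the firewall rules are right.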

Hope this helps.

Calum Loudon
Director of Architecture
Metaswitch Networks
 
P   +44 (0)208 366 1177
E   calum.loudon@xxxxxxxxxxxxxx


-----Original Message-----
From: ceph-users-bounces@xxxxxxxxxxxxxx [mailto:ceph-users-bounces@xxxxxxxxxxxxxx] On Behalf Of Joao Eduardo Luis
Sent: 08 October 2013 13:20
To: Abhay Sachan
Cc: ceph-users@xxxxxxxxxxxxxx
Subject: Re:  ceph-create-keys hung

On 08/10/13 10:58, Abhay Sachan wrote:
> Hi Joao,
> When I run the following command on the node "ceph --admin-daemon 
> /var/run/ceph/ceph-mon.dec.asok mon_status"
> I get the following output:
>
> { "name": "dec",
>    "rank": 0,
>    "state": "probing",
>    "election_epoch": 0,
>    "quorum": [],
>    "outside_quorum": [
>          "dec"],
>    "extra_probe_peers": [
>          "15.213.24.230:6789 <http://15.213.24.230:6789>\/0",
>          "15.213.24.231:6789 <http://15.213.24.231:6789>\/0"],
>    "sync_provider": [],
>    "monmap": { "epoch": 0,
>        "fsid": "166ba014-9b5b-4cea-8403-ecdb6e0a1763",
>        "modified": "0.000000",
>        "created": "0.000000",
>        "mons": [
>              { "rank": 0,
>                "name": "dec",
>                "addr": "15.213.24.241:6789 <http://15.213.24.241:6789>\/0"},
>              { "rank": 1,
>                "name": "jul",
>                "addr": "0.0.0.0:0 <http://0.0.0.0:0>\/1"},
>              { "rank": 2,
>                "name": "julilo",
>                "addr": "0.0.0.0:0 <http://0.0.0.0:0>\/2"}]} }
>
> And the addr for the other two nodes is "0.0.0.0" in monitor list. Any 
> ideas on this??

Abhay,

Your monitors are not connecting to each other, and thus are not
forming quorum.  That's why your monmap appears to be all sketchy: the
monitors haven't found each other, so they haven't filled it in.  That
can be due to a number of issues, but my guess would be those weird
addresses you have defined in your monmap, which don't look much like
valid ip:port addresses, as it looks like you have 'http://foo' in there.
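
For what it's worth, the same reading of the mon_status output can be done mechanically. The sketch below is Python and only uses fields that appear in the output quoted above ("state", "quorum", "monmap", "mons", "name", "addr"). The function name is invented for illustration, and it assumes that an address of 0.0.0.0:0 is just a placeholder for a peer that has not been located yet.

```python
import json


def summarize_mon_status(raw_json):
    """Summarize quorum state from the JSON printed by `mon_status`."""
    status = json.loads(raw_json)
    mons = status["monmap"]["mons"]
    # Assumption: 0.0.0.0:0 marks a peer whose real address is not yet known.
    unresolved = [m["name"] for m in mons if m["addr"].startswith("0.0.0.0:0")]
    return {
        "state": status["state"],
        "has_quorum": bool(status["quorum"]),
        "unresolved_peers": unresolved,
    }
```

Fed the output above, this would report state "probing", no quorum, and "jul" and "julilo" as unresolved peers.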

Anyway, the reason why nothing appears to work is that your monitors
are unable to speak to each other.  Try cranking up debug levels on the
monitors (I'd suggest 'debug mon = 10', 'debug ms = 1' and
'debug auth = 10') and look for connectivity or keyring issues.
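
With the messenger debug level raised, the logs get noisy, so a small tally of connection faults per peer can make the pattern easier to see. The following Python sketch assumes fault lines look like the ones quoted in this thread ("... >> <peer> pipe(...).fault"). Other Ceph versions may format them differently, and the function name is illustrative only.

```python
import re
from collections import Counter

# Matches the peer address in lines such as:
#   ... -- :/1026989 >> 15.213.24.231:6789/0 pipe(0x... c=0x...).fault
FAULT_RE = re.compile(r">>\s+(\S+)\s+pipe\(.*\)\.fault")


def count_faults(lines):
    """Count messenger fault lines per peer address."""
    counts = Counter()
    for line in lines:
        match = FAULT_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts
```

If every peer shows up with a similar count, the problem is more likely on the local node (firewall, wrong address) than on any single remote monitor.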

I'd also advise you to generate a new monmap and inject it on all three monitors, or redeploy the monitors following the docs.

   -Joao


>
> On Tue, Oct 8, 2013 at 3:15 PM, Abhay Sachan <abhaysac@xxxxxxxxx> wrote:
>
>     Hi Joao,
>     I gave the path for the keyring, and these messages are being
>     printed on the screen:
>
>     2013-10-07 15:18:55.048151 7fa1c43bc700  0 -- :/1026989 >>
>     15.213.24.231:6789/0 <http://15.213.24.231:6789/0>
>     pipe(0x7fa1b4000990 sd=4 :0 s=1 pgs=0 cs=0 l=1 c=0x7fa1b400dfe0).fault
>     2013-10-07 15:18:58.048774 7fa1c42bb700  0 -- :/1026989 >>
>     15.213.24.230:6789/0 <http://15.213.24.230:6789/0>
>     pipe(0x7fa1b4001cb0 sd=3 :0 s=1 pgs=0 cs=0 l=1 c=0x7fa1b4002290).fault
>     2013-10-07 15:19:01.049193 7fa1c43bc700  0 -- :/1026989 >>
>     15.213.24.231:6789/0 <http://15.213.24.231:6789/0>
>     pipe(0x7fa1b4000990 sd=4 :0 s=1 pgs=0 cs=0 l=1 c=0x7fa1b40038b0).fault
>     2013-10-07 15:19:04.049690 7fa1c42bb700  0 -- :/1026989 >>
>     15.213.24.241:6789/0 <http://15.213.24.241:6789/0>
>     pipe(0x7fa1b4001cb0 sd=4 :0 s=1 pgs=0 cs=0 l=1 c=0x7fa1b4002590).fault
>     2013-10-07 15:19:07.050098 7fa1c43bc700  0 -- :/1026989 >>
>     15.213.24.231:6789/0 <http://15.213.24.231:6789/0>
>     pipe(0x7fa1b4000990 sd=4 :0 s=1 pgs=0 cs=0 l=1 c=0x7fa1b40038b0).fault
>     2013-10-07 15:19:10.050467 7fa1c42bb700  0 -- :/1026989 >>
>     15.213.24.230:6789/0 <http://15.213.24.230:6789/0>
>     pipe(0x7fa1b4001cb0 sd=4 :0 s=1 pgs=0 cs=0 l=1 c=0x7fa1b4005710).fault
>     2013-10-07 15:19:13.051033 7fa1c43bc700  0 -- :/1026989 >>
>     15.213.24.241:6789/0 <http://15.213.24.241:6789/0>
>     pipe(0x7fa1b4000990 sd=3 :0 s=1 pgs=0 cs=0 l=1 c=0x7fa1b4005870).fault
>     2013-10-07 15:19:16.051257 7fa1c42bb700  0 -- :/1026989 >>
>     15.213.24.231:6789/0 <http://15.213.24.231:6789/0>
>     pipe(0x7fa1b4001cb0 sd=4 :0 s=1 pgs=0 cs=0 l=1 c=0x7fa1b40042e0).fault
>     2013-10-07 15:19:19.051823 7fa1c43bc700  0 -- :/1026989 >>
>     15.213.24.241:6789/0 <http://15.213.24.241:6789/0>
>     pipe(0x7fa1b4000990 sd=3 :0 s=1 pgs=0 cs=0 l=1 c=0x7fa1b4004990).fault
>     2013-10-07 15:19:22.052177 7fa1c42bb700  0 -- :/1026989 >>
>     15.213.24.230:6789/0 <http://15.213.24.230:6789/0>
>     pipe(0x7fa1b4001cb0 sd=4 :0 s=1 pgs=0 cs=0 l=1 c=0x7fa1b40042e0).fault
>     2013-10-07 15:19:25.052526 7fa1c43bc700  0 -- :/1026989 >>
>     15.213.24.241:6789/0 <http://15.213.24.241:6789/0>
>     pipe(0x7fa1b4000990 sd=4 :0 s=1 pgs=0 cs=0 l=1 c=0x7fa1b400f4a0).fault
>     2013-10-07 15:19:28.053000 7fa1c42bb700  0 -- :/1026989 >>
>     15.213.24.231:6789/0 <http://15.213.24.231:6789/0>
>     pipe(0x7fa1b4007d90 sd=4 :0 s=1 pgs=0 cs=0 l=1 c=0x7fa1b4007250).fault
>     2013-10-07 15:19:31.053452 7fa1c43bc700  0 -- :/1026989 >>
>     15.213.24.230:6789/0 <http://15.213.24.230:6789/0>
>     pipe(0x7fa1b4000990 sd=4 :0 s=1 pgs=0 cs=0 l=1 c=0x7fa1b400f4a0).fault
>     2013-10-07 15:19:34.054070 7fa1c42bb700  0 -- :/1026989 >>
>     15.213.24.241:6789/0 <http://15.213.24.241:6789/0>
>     pipe(0x7fa1b4001cb0 sd=3 :0 s=1 pgs=0 cs=0 l=1 c=0x7fa1b4007d90).fault
>     2013-10-07 15:19:37.054303 7fa1c43bc700  0 -- :/1026989 >>
>     15.213.24.230:6789/0 <http://15.213.24.230:6789/0>
>     pipe(0x7fa1b4000990 sd=4 :0 s=1 pgs=0 cs=0 l=1 c=0x7fa1b400a7e0).fault
>     2013-10-07 15:19:40.054724 7fa1c42bb700  0 -- :/1026989 >>
>     15.213.24.241:6789/0 <http://15.213.24.241:6789/0>
>     pipe(0x7fa1b4001cb0 sd=4 :0 s=1 pgs=0 cs=0 l=1 c=0x7fa1b400a1b0).fault
>     2013-10-07 15:19:43.055099 7fa1c43bc700  0 -- :/1026989 >>
>     15.213.24.231:6789/0 <http://15.213.24.231:6789/0>
>     pipe(0x7fa1b4000990 sd=4 :0 s=1 pgs=0 cs=0 l=1 c=0x7fa1b4009610).fault
>     2013-10-07 15:19:46.055462 7fa1c42bb700  0 -- :/1026989 >>
>     15.213.24.230:6789/0 <http://15.213.24.230:6789/0>
>     pipe(0x7fa1b4001cb0 sd=3 :0 s=1 pgs=0 cs=0 l=1 c=0x7fa1b4007d90).fault
>     2013-10-07 15:19:49.056004 7fa1c43bc700  0 -- :/1026989 >>
>     15.213.24.231:6789/0 <http://15.213.24.231:6789/0>
>     pipe(0x7fa1b4000990 sd=4 :0 s=1 pgs=0 cs=0 l=1 c=0x7fa1b400c5a0).fault
>     2013-10-07 15:19:52.056492 7fa1c42bb700  0 -- :/1026989 >>
>     15.213.24.230:6789/0 <http://15.213.24.230:6789/0>
>     pipe(0x7fa1b4001cb0 sd=4 :0 s=1 pgs=0 cs=0 l=1 c=0x7fa1b400b000).fault
>     2013-10-07 15:19:55.056740 7fa1c43bc700  0 -- :/1026989 >>
>     15.213.24.231:6789/0 <http://15.213.24.231:6789/0>
>     pipe(0x7fa1b4000990 sd=4 :0 s=1 pgs=0 cs=0 l=1 c=0x7fa1b40111b0).fault
>     2013-10-07 15:19:58.057117 7fa1c42bb700  0 -- :/1026989 >>
>     15.213.24.241:6789/0 <http://15.213.24.241:6789/0>
>     pipe(0x7fa1b4009110 sd=3 :0 s=1 pgs=0 cs=0 l=1 c=0x7fa1b400b4a0).fault
>     2013-10-07 15:20:01.057596 7fa1c43bc700  0 -- :/1026989 >>
>     15.213.24.230:6789/0 <http://15.213.24.230:6789/0>
>     pipe(0x7fa1b400bc00 sd=4 :0 s=1 pgs=0 cs=0 l=1 c=0x7fa1b400be60).fault
>     2013-10-07 15:20:04.058191 7fa1c42bb700  0 -- :/1026989 >>
>     15.213.24.231:6789/0 <http://15.213.24.231:6789/0>
>     pipe(0x7fa1b400d170 sd=4 :0 s=1 pgs=0 cs=0 l=1 c=0x7fa1b4015300).fault
>     2013-10-07 15:20:07.058837 7fa1c43bc700  0 -- :/1026989 >>
>     15.213.24.230:6789/0 <http://15.213.24.230:6789/0>
>     pipe(0x7fa1b400f640 sd=3 :0 s=1 pgs=0 cs=0 l=1 c=0x7fa1b400bc00).fault
>     2013-10-07 15:20:10.059147 7fa1c42bb700  0 -- :/1026989 >>
>     15.213.24.231:6789/0 <http://15.213.24.231:6789/0>
>     pipe(0x7fa1b400d170 sd=4 :0 s=1 pgs=0 cs=0 l=1 c=0x7fa1b40015c0).fault
>
>
>     Regards,
>     Abhay
>
>
>
>
>     On Tue, Oct 8, 2013 at 2:44 PM, Joao Eduardo Luis
>     <joao.luis@xxxxxxxxxxx> wrote:
>
>         On 08/10/13 05:40, Abhay Sachan wrote:
>
>             Hi Joao,
>             Thanks for replying. All of my monitors are up and running
>             and connected to each other. "ceph -s" is failing on the
>             cluster with the following error:
>
>             2013-10-07 10:12:25.099261 7fd1b948d700 -1 monclient(hunting):
>             ERROR: missing keyring, cannot use cephx for authentication
>             2013-10-07 10:12:25.099271 7fd1b948d700  0 librados: client.admin
>             initialization error (2) No such file or directory
>             Error connecting to cluster: ObjectNotFound
>
>
>         This says it all.  You're somehow missing your keyring.  Or
>         maybe you're not but you're keeping it in a non-default location
>         and you're not specifying a path to it either on your ceph.conf
>         or via '--keyring <path>'.
>
>
>
>             And the logs on each monitor have lots of entries like this:
>             NODE 1:
>
>             2013-10-07 03:58:51.153847 7ff2864c6700  0
>             mon.jul@0(probing).data_health(0) update_stats avail 76%
>             total 42332700 used 7901820 avail 32280480
>             2013-10-07 03:59:51.154051 7ff2864c6700  0
>             mon.jul@0(probing).data_health(0) update_stats avail 76%
>             total 42332700 used 7901832 avail 32280468
>             2013-10-07 04:00:51.154256 7ff2864c6700  0
>             mon.jul@0(probing).data_health(0) update_stats avail 76%
>             total 42332700 used 7901828 avail 32280472
>
>
>         Those messages are simply reports on the mon store's storage
>         capacity.
>
>
>            -Joao
>
>
>
>             Regards,
>             Abhay
>
>             On Thu, Oct 3, 2013 at 8:31 PM, Joao Eduardo Luis
>             <joao.luis@xxxxxxxxxxx> wrote:
>
>                  On 10/03/2013 02:44 PM, Abhay Sachan wrote:
>
>                      Hi All,
>                      I have tried setting up a ceph cluster with 3 nodes (3
>                      monitors). I am using RHEL 6.4 as OS with the dumpling
>                      (0.67.3) release. During cluster creation (using
>                      ceph-deploy as well as mkcephfs), ceph-create-keys
>                      doesn't return on any of the servers. Whereas, if I
>                      create a cluster with only 1 node (1 monitor), key
>                      creation goes through. Has anybody seen this problem,
>                      or any ideas what I might be missing?
>
>                      Regards,
>                      Abhay
>
>
>                  Those symptoms tell me that your monitors are not forming
>                  quorum.  'ceph-create-keys' needs the monitors to first
>                  establish a quorum, otherwise it will hang waiting for
>                  that to happen.
>
>                  Please make sure all your monitors are running.  If so,
>                  try running 'ceph -s' on your cluster.  If that hangs as
>                  well, try accessing each monitor's admin socket to check
>                  what's happening [1].  If that too fails, try looking
>                  into the logs for something obviously wrong.  If you are
>                  not able to discern anything useful at that point, upload
>                  the logs to some place and point us to them -- we'll then
>                  be happy to take a look.
>
>                  Hope this helps.
>
>                     -Joao
>
>                  --
>                  Joao Eduardo Luis
>                  Software Engineer | http://inktank.com | http://ceph.com
>
>
>
>
>         --
>         Joao Eduardo Luis
>         Software Engineer | http://inktank.com | http://ceph.com
>
>
>


--
Joao Eduardo Luis
Software Engineer | http://inktank.com | http://ceph.com
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com




