Re: mons not starting

> 
> My monitors are suddenly not starting up properly, or at all. Using latest
> Debian release from ceph.com/debian-cuttlefish wheezy
> 
> One (mon.7 ip ending in .190) starts but says things like this in the logs:
> 1 mon.7@0(probing) e3 discarding message
> mon_subscribe({monmap=0+,osdmap=796}) and sending client elsewhere
> 1 mon.7@0(probing) e3 discarding message auth(proto 0 25 bytes epoch 0)
> v1 and sending client elsewhere
> 1 mon.7@0(probing) e3 discarding message auth(proto 0 34 bytes epoch 0)
> and sending client elsewhere
> 
> Another (mon.8 ip ending in .191) starts but says this in the logs:
>  0 -- x.x.x.191:6789/0 >> x.x.x.197:0/2400174543 pipe(0x92028c0 sd=660 :6789
> s=0 pgs=0 cs=0 l=0).accept peer addr is really x.x.x.197:0/2400174543 (socket
> is x.x.x.197:55263/0)
>  0 -- x.x.x.191:6789/0 >> x.x.x.197:0/2400174543 pipe(0x9200e00 sd=804 :6789
> s=0 pgs=0 cs=0 l=0).accept peer addr is really x.x.x.197:0/2400174543 (socket
> is x.x.x.197:55267/0)
> 0 -- x.x.x.191:6789/0 >> x.x.x.197:0/2400174543 pipe(0x9207dc0 sd=881 :6789
> s=0 pgs=0 cs=0 l=0).accept peer addr is really x.x.x.197:0/2400174543 (socket
> is x.x.x.197:55269/0)
> 
> And the last one (mon.4 ip ending in .197) won't start, with this in the logs:
> -1 obtain_monmap unable to find a monmap
>  0 mon.4 does not exist in monmap, will attempt to join an existing cluster
>  1 mon.4@-1(probing) e0 preinit fsid 00000000-0000-0000-0000-000000000000
> -1 mon.4@-1(probing) e0 error: cluster_uuid file exists with value '167cc337-e3a3-4df0-8fe8-be84cce7f4f0', != our uuid 00000000-0000-0000-0000-000000000000
> 
> The machine with the IP ending in .197 is a client, the only one at the moment. The
> other two are OSDs.
> 
> Previously, all 3 were working although one of them (normally, but not
> always mon.4) would be marked down whenever I wasn't looking...
> 
> Any hints?
> 
> Thanks
> 
> James

Hmmm... as soon as I hit send on this email, mon.7 and mon.8 suddenly came good. Maybe they had to sort something out between themselves?
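
For anyone reading the quoted log lines, here is a minimal sketch of splitting a prefix such as `mon.7@0(probing) e3` into its parts. It assumes the layout is `mon.<name>@<rank>(<state>) e<epoch>`, which matches the lines in this thread but is my reading rather than something the logs state. The function name `parse_mon_prefix` is made up for illustration.

```python
import re

# Assumed prefix layout: mon.<name>@<rank>(<state>) e<epoch>
PREFIX_RE = re.compile(
    r"mon\.(?P<name>\w+)@(?P<rank>-?\d+)\((?P<state>\w+)\) e(?P<epoch>\d+)"
)


def parse_mon_prefix(line):
    """Return name, rank, state and epoch from a monitor log line, or None."""
    m = PREFIX_RE.search(line)
    if m is None:
        return None
    return {
        "name": m.group("name"),
        "rank": int(m.group("rank")),
        "state": m.group("state"),
        "epoch": int(m.group("epoch")),
    }


# Two lines copied from this thread.
for line in (
    "1 mon.7@0(probing) e3 discarding message auth(proto 0 25 bytes epoch 0)",
    "1 mon.4@-1(probing) e0 preinit fsid 00000000-0000-0000-0000-000000000000",
):
    print(parse_mon_prefix(line))
```

Read this way, mon.4 reports a rank of -1 and epoch 0 while mon.7 reports rank 0 and epoch 3, which fits the "does not exist in monmap" message from mon.4.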

I'm still left with mon.4. Full logs of a failed start:

Starting Ceph mon.4 on machine4...
2013-05-31 17:37:14.381175 7f97a5c13780  0 ceph version 0.61.2 (fea782543a844bb277ae94d3391788b76c5bee60), process ceph-mon, pid 686
2013-05-31 17:37:14.428562 7f97a1ace700 -1 asok(0x1dd8000) AdminSocket: request 'mon_status' not defined
2013-05-31 17:37:15.117149 7f97a1ace700 -1 asok(0x1dd8000) AdminSocket: request 'mon_status' not defined
2013-05-31 17:37:15.397401 7f97a1ace700 -1 asok(0x1dd8000) AdminSocket: request 'mon_status' not defined
2013-05-31 17:37:15.442787 7f97a1ace700 -1 asok(0x1dd8000) AdminSocket: request 'mon_status' not defined
2013-05-31 17:37:15.588285 7f97a5c13780 -1 obtain_monmap unable to find a monmap
2013-05-31 17:37:15.588363 7f97a5c13780  0 mon.4 does not exist in monmap, will attempt to join an existing cluster
2013-05-31 17:37:15.589063 7f97a5c13780  1 mon.4@-1(probing) e0 preinit fsid 00000000-0000-0000-0000-000000000000
2013-05-31 17:37:15.589136 7f97a5c13780 -1 mon.4@-1(probing) e0 error: cluster_uuid file exists with value '167cc337-e3a3-4df0-8fe8-be84cce7f4f0', != our uuid 00000000-0000-0000-0000-000000000000
failed: 'ulimit -n 8192;  /usr/bin/ceph-mon -i 4 --pid-file /var/run/ceph/mon.4.pid -c /etc/ceph/ceph.conf '
Starting ceph-create-keys on machine4...
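
To make the failing line easier to spot when scanning a long log, here is a small sketch that pulls the two uuids out of the `cluster_uuid file exists` error and flags whether the monitor's own uuid is all zeros. It only parses the text of the log line as it appears above and does not call any Ceph tool. The helper name `check_fsid_line` is hypothetical.

```python
import re
import uuid

UUID = r"[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}"

# Pattern taken from the wording of the error line in the log above.
MISMATCH_RE = re.compile(
    r"cluster_uuid file exists with value '(?P<on_disk>" + UUID + r")', "
    r"!= our uuid (?P<ours>" + UUID + r")"
)


def check_fsid_line(line):
    """Extract both uuids from the mismatch error, or return None if absent."""
    m = MISMATCH_RE.search(line)
    if m is None:
        return None
    on_disk = uuid.UUID(m.group("on_disk"))
    ours = uuid.UUID(m.group("ours"))
    return {
        "on_disk": str(on_disk),
        "ours": str(ours),
        # An all-zero uuid suggests the monitor never loaded a real fsid.
        "ours_is_nil": ours.int == 0,
    }


line = (
    "2013-05-31 17:37:15.589136 7f97a5c13780 -1 mon.4@-1(probing) e0 error: "
    "cluster_uuid file exists with value '167cc337-e3a3-4df0-8fe8-be84cce7f4f0', "
    "!= our uuid 00000000-0000-0000-0000-000000000000"
)
print(check_fsid_line(line))
```

In this log the uuid on disk is a real value and the monitor's own uuid is nil, so the mismatch looks like a consequence of the earlier "unable to find a monmap" line rather than two clusters disagreeing. That interpretation is a guess from the log text alone.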

Should I just rebuild the mon or is there an easy fix?

Also, why does ceph-create-keys get started over and over? I have 4 of them running now.

Thanks

James

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



