> My monitors are suddenly not starting up properly, or at all. Using latest
> Debian release from ceph.com/debian-cuttlefish wheezy
>
> One (mon.7 ip ending in .190) starts but says things like this in the logs:
> 1 mon.7@0(probing) e3 discarding message mon_subscribe({monmap=0+,osdmap=796}) and sending client elsewhere
> 1 mon.7@0(probing) e3 discarding message auth(proto 0 25 bytes epoch 0) v1 and sending client elsewhere
> 1 mon.7@0(probing) e3 discarding message auth(proto 0 34 bytes epoch 0) and sending client elsewhere
>
> Another (mon.8 ip ending in .191) starts but says this in the logs:
> 0 -- x.x.x.191:6789/0 >> x.x.x.197:0/2400174543 pipe(0x92028c0 sd=660 :6789 s=0 pgs=0 cs=0 l=0).accept peer addr is really x.x.x.197:0/2400174543 (socket is x.x.x.197:55263/0)
> 0 -- x.x.x.191:6789/0 >> x.x.x.197:0/2400174543 pipe(0x9200e00 sd=804 :6789 s=0 pgs=0 cs=0 l=0).accept peer addr is really x.x.x.197:0/2400174543 (socket is x.x.x.197:55267/0)
> 0 -- x.x.x.191:6789/0 >> x.x.x.197:0/2400174543 pipe(0x9207dc0 sd=881 :6789 s=0 pgs=0 cs=0 l=0).accept peer addr is really x.x.x.197:0/2400174543 (socket is x.x.x.197:55269/0)
>
> And the last one (mon.4 ip ending in .197) won't start, with this in the logs:
> -1 obtain_monmap unable to find a monmap
> 0 mon.4 does not exist in monmap, will attempt to join an existing cluster
> 1 mon.4@-1(probing) e0 preinit fsid 00000000-0000-0000-0000-000000000000
> -1 mon.4@-1(probing) e0 error: cluster_uuid file exists with value '167cc337-e3a3-4df0-8fe8-be84cce7f4f0', != our uuid 00000000-0000-0000-0000-000000000000
>
> Machine with ip ending in .197 is a client, the only one at the moment. The
> other two are osd's.
>
> Previously, all 3 were working although one of them (normally, but not
> always mon.4) would be marked down whenever I wasn't looking...
>
> Any hints?
>
> Thanks
>
> James

Hmmm... as soon as I hit send on this email, suddenly mon.7 and mon.8 came good...
maybe they had to sort something out between them or something?

I'm still left with mon.4. Full logs of a failed start:

Starting Ceph mon.4 on machine4...
2013-05-31 17:37:14.381175 7f97a5c13780 0 ceph version 0.61.2 (fea782543a844bb277ae94d3391788b76c5bee60), process ceph-mon, pid 686
2013-05-31 17:37:14.428562 7f97a1ace700 -1 asok(0x1dd8000) AdminSocket: request 'mon_status' not defined
2013-05-31 17:37:15.117149 7f97a1ace700 -1 asok(0x1dd8000) AdminSocket: request 'mon_status' not defined
2013-05-31 17:37:15.397401 7f97a1ace700 -1 asok(0x1dd8000) AdminSocket: request 'mon_status' not defined
2013-05-31 17:37:15.442787 7f97a1ace700 -1 asok(0x1dd8000) AdminSocket: request 'mon_status' not defined
2013-05-31 17:37:15.588285 7f97a5c13780 -1 obtain_monmap unable to find a monmap
2013-05-31 17:37:15.588363 7f97a5c13780 0 mon.4 does not exist in monmap, will attempt to join an existing cluster
2013-05-31 17:37:15.589063 7f97a5c13780 1 mon.4@-1(probing) e0 preinit fsid 00000000-0000-0000-0000-000000000000
2013-05-31 17:37:15.589136 7f97a5c13780 -1 mon.4@-1(probing) e0 error: cluster_uuid file exists with value '167cc337-e3a3-4df0-8fe8-be84cce7f4f0', != our uuid 00000000-0000-0000-0000-000000000000
failed: 'ulimit -n 8192; /usr/bin/ceph-mon -i 4 --pid-file /var/run/ceph/mon.4.pid -c /etc/ceph/ceph.conf '
Starting ceph-create-keys on machine4...

Should I just rebuild the mon or is there an easy fix?

Also why does ceph-create-keys get started over and over? I have 4 of them running now.

Thanks

James