Which logs? I'm assuming /var/log/salt/minion, since the rest of the logs on the minions are relatively empty. Possibly the Cthulhu log from the master? I'm running on Ubuntu 14.04 and don't have an httpd service; I had been starting/stopping apache2. Likewise there is no supervisord service, and I've been using supervisorctl to start/stop Cthulhu. I've performed the calamari-ctl clear/init sequence more than twice, also stopping/starting apache2 and Cthulhu along the way.
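For reference, the sequence below is roughly what that translates to on Ubuntu 14.04. This is only my best guess at the equivalents, so treat it as a sketch: apache2 in place of httpd, and the supervisor init script (which the stock 14.04 package names "supervisor") in place of supervisord.

sudo supervisorctl shutdown               # stops cthulhu and the other processes supervisord manages for calamari
sudo service apache2 stop                 # 14.04 ships apache2 rather than httpd
sudo calamari-ctl clear --yes-i-am-sure
sudo calamari-ctl initialize
sudo service supervisor start             # assumes the init script is named "supervisor"; or, if supervisord itself was left running, sudo supervisorctl start cthulhu
sudo service apache2 start

(And for the debug-level logs, I'm guessing you mean whatever cthulhu writes under /var/log/calamari/ on the master, plus /var/log/salt/minion on the nodes?)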

> -----Original Message-----
> From: Gregory Meno [mailto:gmeno@xxxxxxxxxx]
> Sent: Tuesday, May 12, 2015 5:58 PM
> To: Bruce McFarland
> Cc: ceph-calamari@xxxxxxxxxxxxxx; ceph-users@xxxxxxxx; ceph-devel (ceph-devel@xxxxxxxxxxxxxxx)
> Subject: Re: [ceph-calamari] Does anyone understand Calamari??
>
> All that looks fine.
>
> There must be some state where the cluster is known to calamari and it is failing to actually show it.
>
> If you have time to debug I would love to see the logs at debug level.
>
> If you don't, we could try cleaning out calamari's state:
> sudo supervisorctl shutdown
> sudo service httpd stop
> sudo calamari-ctl clear --yes-i-am-sure
> sudo calamari-ctl initialize
> then
> sudo service supervisord start
> sudo service httpd start
>
> see what the API and UI say then.
>
> regards,
> Gregory
>
> On May 12, 2015, at 5:18 PM, Bruce McFarland <Bruce.McFarland@xxxxxxxxxxxxxxxx> wrote:
> >
> > Master was ess68 and now it's essperf3.
> >
> > On all cluster nodes the following files now have 'master: essperf3':
> > /etc/salt/minion
> > /etc/salt/minion.d/calamari.conf
> > /etc/diamond/diamond.conf
> >
> > The 'salt \* ceph.get_heartbeats' is being run on essperf3 - here's a 'salt \* test.ping' from the essperf3 Calamari master to the cluster. I've also included a quick cluster sanity test with the output of ceph -s and ceph osd tree. And, for your reading pleasure, the output of 'salt octeon109 ceph.get_heartbeats', since I suspect there might be a missing field in the monitor response.
> >
> > root@essperf3:/etc/ceph# salt \* test.ping
> > octeon108:
> >     True
> > octeon114:
> >     True
> > octeon111:
> >     True
> > octeon101:
> >     True
> > octeon106:
> >     True
> > octeon109:
> >     True
> > octeon118:
> >     True
> > root@essperf3:/etc/ceph# ceph osd tree
> > # id  weight  type name          up/down  reweight
> > -1    7       root default
> > -4    1         host octeon108
> > 0     1           osd.0          up       1
> > -2    1         host octeon111
> > 1     1           osd.1          up       1
> > -5    1         host octeon115
> > 2     1           osd.2          DNE
> > -6    1         host octeon118
> > 3     1           osd.3          up       1
> > -7    1         host octeon114
> > 4     1           osd.4          up       1
> > -8    1         host octeon106
> > 5     1           osd.5          up       1
> > -9    1         host octeon101
> > 6     1           osd.6          up       1
> > root@essperf3:/etc/ceph# ceph -s
> >     cluster 868bfacc-e492-11e4-89fa-000fb711110c
> >      health HEALTH_OK
> >      monmap e1: 1 mons at {octeon109=209.243.160.70:6789/0}, election epoch 1, quorum 0 octeon109
> >      osdmap e80: 6 osds: 6 up, 6 in
> >       pgmap v26765: 728 pgs, 2 pools, 20070 MB data, 15003 objects
> >             60604 MB used, 2734 GB / 2793 GB avail
> >                  728 active+clean
> > root@essperf3:/etc/ceph#
> >
> > root@essperf3:/etc/ceph# salt octeon109 ceph.get_heartbeats
> > octeon109:
> >     ----------
> >     - boot_time:
> >         1430784431
> >     - ceph_version:
> >         0.80.8-0.el6
> >     - services:
> >         ----------
> >         ceph-mon.octeon109:
> >             ----------
> >             cluster:
> >                 ceph
> >             fsid:
> >                 868bfacc-e492-11e4-89fa-000fb711110c
> >             id:
> >                 octeon109
> >             status:
> >                 ----------
> >                 election_epoch:
> >                     1
> >                 extra_probe_peers:
> >                 monmap:
> >                     ----------
> >                     created:
> >                         2015-04-16 23:50:52.412686
> >                     epoch:
> >                         1
> >                     fsid:
> >                         868bfacc-e492-11e4-89fa-000fb711110c
> >                     modified:
> >                         2015-04-16 23:50:52.412686
> >                     mons:
> >                         ----------
> >                         - addr:
> >                             209.243.160.70:6789/0
> >                         - name:
> >                             octeon109
> >                         - rank:
> >                             0
> >                 name:
> >                     octeon109
> >                 outside_quorum:
> >                 quorum:
> >                     - 0
> >                 rank:
> >                     0
> >                 state:
> >                     leader
> >                 sync_provider:
> >             type:
> >                 mon
> >             version:
> >                 0.86
> >     ----------
> >     - 868bfacc-e492-11e4-89fa-000fb711110c:
> >         ----------
> >         fsid:
> >             868bfacc-e492-11e4-89fa-000fb711110c
> >         name:
> >             ceph
> >         versions:
> >             ----------
> >             config:
> >                 87f175c60e5c7ec06c263c556056fbcb
> >             health:
> >                 a907d0ec395713369b4843381ec31bc2
> >             mds_map:
> >                 1
> >             mon_map:
> >                 1
> >             mon_status:
> >                 1
> >             osd_map:
> >                 80
> >             pg_summary:
> >                 7e29d7cc93cfced8f3f146cc78f5682f
> > root@essperf3:/etc/ceph#
> >
> >
> >> -----Original Message-----
> >> From: Gregory Meno [mailto:gmeno@xxxxxxxxxx]
> >> Sent: Tuesday, May 12, 2015 5:03 PM
> >> To: Bruce McFarland
> >> Cc: ceph-calamari@xxxxxxxxxxxxxx; ceph-users@xxxxxxxx; ceph-devel (ceph-devel@xxxxxxxxxxxxxxx)
> >> Subject: Re: [ceph-calamari] Does anyone understand Calamari??
> >>
> >> Bruce,
> >>
> >> It is great to hear that salt is reporting status from all the nodes in the cluster.
> >>
> >> Let me see if I understand your question:
> >>
> >> You want to know what conditions cause us to recognize a working cluster?
> >>
> >> see
> >> https://github.com/ceph/calamari/blob/master/cthulhu/cthulhu/manager/manager.py#L135
> >> https://github.com/ceph/calamari/blob/master/cthulhu/cthulhu/manager/manager.py#L349
> >> and
> >> https://github.com/ceph/calamari/blob/master/cthulhu/cthulhu/manager/cluster_monitor.py
> >>
> >> Let's check whether you need to be digging into that level of detail:
> >>
> >> You switched to a new instance of calamari and it is not recognizing the cluster.
> >>
> >> You want to know what you are overlooking? Would you please clarify with some hostnames?
> >>
> >> i.e. Let's say that your old calamari node was called calamariA and that your new node is calamariB.
> >>
> >> From which are you running the get_heartbeats?
> >>
> >> What is the master setting in the minion config files out on the nodes of the cluster? If things are set up correctly they would look like this:
> >>
> >> [root@node1 shadow_man]# cat /etc/salt/minion.d/calamari.conf
> >> master: calamariB
> >>
> >> If this is the case, the thing I would check is whether the http://calamariB/api/v2/cluster endpoint is reporting anything.
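For what it's worth, checking that endpoint from the CLI on the new master would be something like the sketch below (essperf3 being the new master here; the v2 API may insist on an authenticated session, in which case a browser that is already logged in to the Calamari UI is the easier way to look at it):

curl -s http://essperf3/api/v2/cluster

If cthulhu has registered the cluster, that should report it (fsid 868bfacc-e492-11e4-89fa-000fb711110c); an empty result would mean calamari still hasn't recognized it.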
> >>
> >> hope this helps,
> >> Gregory
> >>
> >>> On May 12, 2015, at 4:34 PM, Bruce McFarland <Bruce.McFarland@xxxxxxxxxxxxxxxx> wrote:
> >>>
> >>> Increasing the audience since ceph-calamari is not responsive. What salt event/info does the Calamari Master expect to see from the ceph-mon to determine there is a working cluster? I had to change servers hosting the calamari master and can't get the new machine to recognize the cluster. The 'salt \* ceph.get_heartbeats' returns monmap, fsid, ver, epoch, etc. for the monitor and all of the OSDs. Can anyone point me to docs or code that might enlighten me as to what I'm overlooking? Thanks.
> >>> _______________________________________________
> >>> ceph-calamari mailing list
> >>> ceph-calamari@xxxxxxxxxxxxxx
> >>> http://lists.ceph.com/listinfo.cgi/ceph-calamari-ceph.com

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com