Master was ess68 and now it's essperf3. On all cluster nodes the following files now have 'master: essperf3':

/etc/salt/minion
/etc/salt/minion.d/calamari.conf
/etc/diamond/diamond.conf

The 'salt \* ceph.get_heartbeats' is being run on essperf3 - here's a 'salt \* test.ping' from the essperf3 Calamari Master to the cluster. I've also included a quick cluster sanity test with the output of 'ceph -s' and 'ceph osd tree'. And, for your reading pleasure, the output of 'salt octeon109 ceph.get_heartbeats', since I suspect there might be a missing field in the monitor response.

root@essperf3:/etc/ceph# salt \* test.ping
octeon108:
    True
octeon114:
    True
octeon111:
    True
octeon101:
    True
octeon106:
    True
octeon109:
    True
octeon118:
    True

root@essperf3:/etc/ceph# ceph osd tree
# id    weight  type name       up/down reweight
-1      7       root default
-4      1               host octeon108
0       1                       osd.0   up      1
-2      1               host octeon111
1       1                       osd.1   up      1
-5      1               host octeon115
2       1                       osd.2   DNE
-6      1               host octeon118
3       1                       osd.3   up      1
-7      1               host octeon114
4       1                       osd.4   up      1
-8      1               host octeon106
5       1                       osd.5   up      1
-9      1               host octeon101
6       1                       osd.6   up      1

root@essperf3:/etc/ceph# ceph -s
    cluster 868bfacc-e492-11e4-89fa-000fb711110c
     health HEALTH_OK
     monmap e1: 1 mons at {octeon109=209.243.160.70:6789/0}, election epoch 1, quorum 0 octeon109
     osdmap e80: 6 osds: 6 up, 6 in
      pgmap v26765: 728 pgs, 2 pools, 20070 MB data, 15003 objects
            60604 MB used, 2734 GB / 2793 GB avail
                 728 active+clean

root@essperf3:/etc/ceph# salt octeon109 ceph.get_heartbeats
octeon109:
    ----------
    - boot_time: 1430784431
    - ceph_version: 0.80.8-0.el6
    - services:
        ----------
        ceph-mon.octeon109:
            ----------
            cluster: ceph
            fsid: 868bfacc-e492-11e4-89fa-000fb711110c
            id: octeon109
            status:
                ----------
                election_epoch: 1
                extra_probe_peers:
                monmap:
                    ----------
                    created: 2015-04-16 23:50:52.412686
                    epoch: 1
                    fsid: 868bfacc-e492-11e4-89fa-000fb711110c
                    modified: 2015-04-16 23:50:52.412686
                    mons:
                        ----------
                        - addr: 209.243.160.70:6789/0
                        - name: octeon109
                        - rank: 0
                name: octeon109
                outside_quorum:
                quorum:
                    - 0
                rank: 0
                state: leader
                sync_provider:
            type: mon
            version: 0.86
    ----------
    - 868bfacc-e492-11e4-89fa-000fb711110c:
        ----------
        fsid: 868bfacc-e492-11e4-89fa-000fb711110c
        name: ceph
        versions:
            ----------
            config: 87f175c60e5c7ec06c263c556056fbcb
            health: a907d0ec395713369b4843381ec31bc2
            mds_map: 1
            mon_map: 1
            mon_status: 1
            osd_map: 80
            pg_summary: 7e29d7cc93cfced8f3f146cc78f5682f
root@essperf3:/etc/ceph#
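For completeness, here is roughly how I checked and bounced the minions after the master change (a sketch, not a transcript - the salt-minion service name and init commands vary by distro):

# confirm every node really points at the new master
# (expect 'master: essperf3' back from each minion)
salt \* cmd.run 'grep -h "^master:" /etc/salt/minion /etc/salt/minion.d/calamari.conf'

# restart the minions so they reconnect to essperf3
salt \* cmd.run 'service salt-minion restart'

# on essperf3, make sure every minion key shows up as accepted
salt-key -L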
> -----Original Message-----
> From: Gregory Meno [mailto:gmeno@xxxxxxxxxx]
> Sent: Tuesday, May 12, 2015 5:03 PM
> To: Bruce McFarland
> Cc: ceph-calamari@xxxxxxxxxxxxxx; ceph-users@xxxxxxxx; ceph-devel
> (ceph-devel@xxxxxxxxxxxxxxx)
> Subject: Re: [ceph-calamari] Does anyone understand Calamari??
>
> Bruce,
>
> It is great to hear that salt is reporting status from all the nodes in
> the cluster.
>
> Let me see if I understand your question:
>
> You want to know what conditions cause us to recognize a working cluster?
>
> See
> https://github.com/ceph/calamari/blob/master/cthulhu/cthulhu/manager/manager.py#L135
>
> https://github.com/ceph/calamari/blob/master/cthulhu/cthulhu/manager/manager.py#L349
>
> and
>
> https://github.com/ceph/calamari/blob/master/cthulhu/cthulhu/manager/cluster_monitor.py
>
> Let's check whether you need to be digging into that level of detail:
>
> You switched to a new instance of calamari and it is not recognizing the
> cluster.
>
> You want to know what you are overlooking? Would you please clarify with
> some hostnames?
>
> i.e. let's say that your old calamari node was called calamariA and that
> your new node is calamariB.
>
> From which node are you running the get_heartbeats?
>
> What is the master setting in the minion config files out on the nodes of
> the cluster? If things are set up correctly they would look like this:
>
> [root@node1 shadow_man]# cat /etc/salt/minion.d/calamari.conf
> master: calamariB
>
> If that is the case, the thing I would check is whether the
> http://calamariB/api/v2/cluster endpoint is reporting anything.
>
> hope this helps,
> Gregory
>
>
> > On May 12, 2015, at 4:34 PM, Bruce McFarland
> > <Bruce.McFarland@xxxxxxxxxxxxxxxx> wrote:
> >
> > Increasing the audience since ceph-calamari is not responsive. What salt
> > event/info does the Calamari Master expect to see from the ceph-mon to
> > determine there is a working cluster? I had to change servers hosting
> > the calamari master and can't get the new machine to recognize the
> > cluster. The 'salt \* ceph.get_heartbeats' returns monmap, fsid, ver,
> > epoch, etc. for the monitor and all of the OSDs. Can anyone point me to
> > docs or code that might enlighten me to what I'm overlooking? Thanks.
> > _______________________________________________
> > ceph-calamari mailing list
> > ceph-calamari@xxxxxxxxxxxxxx
> > http://lists.ceph.com/listinfo.cgi/ceph-calamari-ceph.com
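P.S. For anyone following the thread, the endpoint check Gregory suggests can be run from the shell on the new master, roughly like this (a sketch; auth/session details may differ between Calamari versions):

# does the new master's REST API see the cluster? an empty list means
# cthulhu has not accepted it yet (a 403 means you need an
# authenticated session first - log in via the web UI)
curl -s http://essperf3/api/v2/cluster

# server side, watch cthulhu while the heartbeats arrive; this log
# path is the default location and may differ on your install
tail -f /var/log/calamari/cthulhu.log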