I think we should also consider the impact of the change. Changing a perf
counter is a different matter from changing a metadata metric, which is
likely to break far more things. It would be nice if we could at least keep
a certain level of backwards compatibility in these metadata metrics --
i.e. we could add a couple of labels but probably should not remove them?

That is also why I suggested keeping the old ceph_quorum_count metric. It is
a fairly common metric that may be widely used. However, if we backport the
name change to luminous, I'm ok with using ceph_quorum_status instead.

-boris

btw: I apologize if you received this message twice; the original message
was rejected by kernel.org because it was in HTML.

On Wed, Feb 28, 2018 at 2:00 PM, Jan Fajerski <jfajerski@xxxxxxxx> wrote:
> On Wed, Feb 28, 2018 at 12:26:15PM +0000, John Spray wrote:
>>
>> Inevitably, we're starting to hit cases where we have to think about
>> compatibility when making changes to the prometheus output (same
>> issues will apply to changes to perf counters that are passed
>> through):
>> https://github.com/ceph/ceph/pull/20506#issuecomment-368806208
>>
>> For the moment, we don't have any policy around this, so I anticipate
>> things changing at will until the point that we make a policy, which
>> might naturally coincide with introducing the in-tree grafana
>> dashboards (because at that point the prometheus output will be
>> demonstrably reasonably complete).
>>
>> Thinking about what kind of policy we want, the extremes would be:
>> - Do nothing: all counters can change at will (even though most of the
>> time they won't)
>> - Match Ceph protocol interop: all changes would be backwards
>> compatible through two versions
>>
>> We will soon have some official grafana dashboards in the Ceph tree,
>> which I anticipate most people using, but there will certainly be
>> people who craft their own dashboards too.
>> I'm hoping that the vendors shipping Ceph-based products will all be
>> working with the in-tree dashboards, so this is probably more a topic
>> of concern to large-scale users.
>>
>> I think it's reasonable for people with custom dashboards to expect
>> that we not knowingly break them with updates to our stable branches:
>> that's a pretty easy thing for us to accomplish.
>
> No question about this.
>>
>>
>> The part that's probably more debatable is: should someone with an
>> external dashboard built for luminous expect it to work seamlessly on
>> mimic? I would say probably not. Because we make major internal
>> changes between major releases, it's expected that various performance
>> counters would go away or change in meaningful ways.
>>
>> Any thoughts?
>
> I think non-breaking changes within a stable branch and possibly-breaking
> changes from one stable release to another is a decent policy. We can
> probably add a section to the release notes that lists breaking changes and
> maybe even hints as to which queries can replace non-functioning ones.
> This should be straightforward in most cases, since we would usually change
> metrics to add functionality, not remove it.
>>
>>
>> John
>> --
>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
>> the body of a message to majordomo@xxxxxxxxxxxxxxx
>> More majordomo info at http://vger.kernel.org/majordomo-info.html
>>
>
> --
> Jan Fajerski
> Engineer Enterprise Storage
> SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton,
> HRB 21284 (AG Nürnberg)