Re: monitoring

Adding dev list.  We haven't talked through much of this in any detail in
the orchestrator calls yet aside from a vague discussion about what 
should/shouldn't be in scope.

On Thu, 28 Nov 2019, Paul Cuzner wrote:
> On Thu, Nov 28, 2019 at 2:37 AM Sage Weil <sweil@xxxxxxxxxx> wrote:
> 
> > On Wed, 27 Nov 2019, Paul Cuzner wrote:
> > > Hi,
> > >
> > > I've got a working gist for the add/remove of the monitoring solution.
> > > https://gist.github.com/pcuzner/ac542ce3fa9a4699bb9310b1fd5095d0
> > >
> > >  I'm out for the next couple of days, but will get a PR raised next week
> > to
> > > get this started properly.
> >
> > For some reason it won't let me comment on that gist.
> >
> > - I don't think we should install anything on the host outside of the unit
> > file and /var/lib/ceph/$fsid/$thing.  I suggest $thing be 'prometheus',
> > 'alertmanager', 'node-exporter', 'grafana'.  We could combine all but
> > node-exporter into a single 'monitoring' thing but i'm worried this
> > obscures things too much when, for example, the user might have an
> > external prometheus but still need alertmanager, and so on.
> >
> > So all the configs should live in
> > /var/lib/ceph/$fsid/$thing/prometheus.yml and so on, and then bound to the
> > right /etc/whatever location by the container config.
> >
> 
> I struggle with this one. Channelling my inner sysadmin: "I expect config
> settings to be in /etc and data to be in /var/lib - that's what FHS says
> and that's how other systems look that I have to manage, so why does Ceph
> have to do things differently?"

1. Because it's a containerized service.  Things are in /etc *inside* the 
container, not outside.  Sprinkling these configs in /etc mixes 
containerized service configs with the *host*'s configs, which seems very 
untidy to me.
2. Putting it all in /var/lib/ceph/whatever means it's easy to find and 
easy to clean up.
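
To make this concrete, here's a rough sketch of the bind-mount idea (the 
fsid, image name, and :Z SELinux relabel flag are all placeholders, not 
what we'd necessarily ship):

```shell
# Sketch only: the host keeps the config under /var/lib/ceph/$FSID/prometheus,
# and the container sees it at the usual /etc path via a bind mount.
FSID=00000000-0000-0000-0000-000000000000   # placeholder cluster fsid
DATA=/var/lib/ceph/$FSID/prometheus
VOLARG="-v $DATA/prometheus.yml:/etc/prometheus/prometheus.yml:Z"
# the generated unit file would run something along these lines:
echo podman run -d --name "ceph-$FSID-prometheus" $VOLARG prom/prometheus
```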

> I'm also not sure of the value of fsid in the dir names. I can see the
> value if a host has to support multiple ceph clusters - but outside dev is
> that something that the community or our customers actually want?

Most deployments won't need it, but it will avoid a whole range of 
problems when they do.  Once it becomes trivial to bootstrap clusters, it 
also becomes trivial to end up with multiple clusters overlapping on the 
same host.

And, like above, it keeps things tidy.
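
For illustration, the sort of layout this gives (fsids are placeholders):

```
/var/lib/ceph/
├── <fsid-A>/
│   ├── prometheus/
│   ├── alertmanager/
│   ├── grafana/
│   └── node-exporter/
└── <fsid-B>/
    └── prometheus/
```

Everything for a cluster lives under one directory, so tearing a cluster 
down is removing that tree plus disabling its units.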
 
> The gist downloads the separate containers we need in parallel - which I
> think is a good thing! It reduces time.

Sure... that's something we could do regardless of whether it's a separate 
script or part of ceph-daemon.  Probably what we actually want is for the 
ssh 'host add' command to kick off some prestaging of containers in the 
background so that the first daemon deployment doesn't wait on a 
container download at all.
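
Hypothetically, something along these lines (the image names are stand-ins, 
and ceph-daemon would presumably do this itself rather than via a helper 
script):

```shell
# Hypothetical prestage helper: pull each container image in the background
# so later deploys find it already cached, then wait for all pulls to finish.
prestage_images() {
    local img
    for img in "$@"; do
        podman pull "$img" &
    done
    wait    # returns once every background pull has completed
}

# e.g. kicked off by 'host add':
#   prestage_images prom/prometheus prom/alertmanager grafana/grafana
```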

> IMO, having monitoring-add deploy grafana/prom and alertmanager together
> by default is the way to go. TBH, when I started this, I was putting them
> all in the same pod under podman for management and treating them as a
> single unit - but having to support 'legacy' docker put an end to that :)
> 
> If a user wishes to use a separate prometheus, that will normally have its
> own alertmanager too. Which alertmanager a prometheus server uses is
> defined in its prometheus.yml. With external prometheus, rules, alerts and
> receiver
> definitions are going to be an exercise for the reader. We'll need to
> document the settings, but the admin will need to apply them - in this
> scenario, we could possibly generate sample files that the admin can pick
> up and apply? To my mind deployment of monitoring has two pathways;
> default - "monitoring add" yields prom/grafana/alertmanager containers
> deployed to machine
> external-prom - "monitoring add" just deploys grafana, and points its
> default data source at the external prom url. We're also making an
> assumption here that the prometheus server is open and doesn't require auth
> (OCP's prometheus for example has auth enabled)

I think it makes sense to focus on the out-of-the-box opinionated easy 
scenario vs the DIY case, in general at least.  But I have a few 
questions...

- In the DIY case, does it make sense to leave the node-exporter to the 
reader too?  Or might it make sense for us to help deploy the 
node-exporter, while they run the external/existing prometheus instance?

- Likewise, the alertmanager is going to have a bunch of ceph-specific 
alerts configured, right?  Might they want their own prom but we deploy 
our alerts?  (Is there any dependency in the dashboard on a particular set 
of alerts in prometheus?)

I'm guessing you think no in both these cases...
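
For reference, the wiring Paul describes lives in prometheus.yml, roughly 
like this (the target and rule-file path are illustrative, not what we'd 
actually ship):

```yaml
# illustrative prometheus.yml fragment: which alertmanager this prometheus
# talks to, and where the (ceph-specific) alert rules would come from
alerting:
  alertmanagers:
    - static_configs:
        - targets: ['alertmanager.example.com:9093']
rule_files:
  - '/etc/prometheus/ceph_alerts.yml'
```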

> > - Let's teach ceph-daemon how to do this, so that you do 'ceph-daemon
> > deploy --fsid ... --name prometheus.foo -i input.json'.  ceph-daemon
> > has the framework for opening firewall ports etc now... just add ports
> > based on the daemon type.
> >
> 
> TBH, I'd keep the monitoring containers away from the ceph daemons. They
> require different parameters, config files etc so why not keep them
> separate and keep the ceph logic clean. This also allows us to change
> monitoring without concerns over logic changes to normal ceph daemon
> management.

Okay, but mgr/ssh is still going to be wired up to deploy these. And to do 
so on a per-cluster, containerized basis... which means all of the infra 
in ceph-daemon will still be useful.  It seems easiest to just add it 
there.

Your points above seem to point toward simplifying the containers we 
deploy to just two containers, one that's one-per-cluster for 
prom+alertmanager+grafana, and one that's per-host for the node-exporter.  
But I think making it fit in nicely with the other ceph containers (e.g., 
/var/lib/ceph/$fsid/$thing) makes sense.  Especially since we can deploy 
these during bootstrap by default (unless some --external-prometheus is 
passed), so it all happens without the admin having to think about it.

> > WDYT?
> >
> >
> I'm sure a lot of the above has already been discussed at length with the
> SuSE folks, so apologies for going over ground that you've already covered.

Not yet! :)

sage
_______________________________________________
Dev mailing list -- dev@xxxxxxx
To unsubscribe send an email to dev-leave@xxxxxxx


