On Wed, 4 Dec 2019, Paul Cuzner wrote:
> Interesting discussion - but I don't want to lose sight of the original
> questions.
>
> ceph-daemon makes several deployment decisions at the moment that differ
> from existing deployment patterns. This is the first point that I wanted
> to raise.
> - it assumes that from Octopus onwards, the only deployment pattern we
> provide is container only.
> - it places all of Ceph's files (config and data) within /var/lib. In the
> past, even with containers, we've still used /etc for config to align with
> the FHS and, since the OS is package based, config from other packages
> adheres to the FHS anyway - which makes Ceph different.
> - it uses the fsid in path names and container names, just in case users
> want to run multiple ceph clusters on the same machine. IMO this adds
> complication to 100% of deployments but may benefit only 5% of the user
> base (numbers plucked out of the air on that one!)
>
> Perhaps all of these design points trace back to a single idea - support
> multiple ceph clusters on the same set of machine(s). Is this the goal? Is
> this what Ceph users want?

It was one of my goals.  A few reasons:

- It's easy and clean.
- These users do exist.
- When we deprecated this behaviour before, our justification was "you
  should be using containers".  Well, here we are.
- Rook allows this (with the (current) caveat that you can't put mons from
  multiple clusters on the same host/IP if they're using default ports).
- The paths for Rook are also convoluted like this, nested under the
  Kubernetes namespace name.
- The pain of weird paths is mitigated when you enter a container (or use
  the shell container).
- Tab-completion works for both path names and systemd service names.

Also, ceph-daemon shell and similar commands will figure out the fsid
themselves when there is a single cluster on the host.
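To make that concrete, the layout and naming come out roughly like this
(sketch only -- the fsid and hostname are made up, and the exact unit-name
pattern may still change):

    # one data dir and one systemd unit per daemon, keyed by fsid
    /var/lib/ceph/d6a3b6e0-.../mon.host1/
    /var/lib/ceph/d6a3b6e0-.../osd.3/
    systemctl status ceph-d6a3b6e0-...@mon.host1.service

    # with a single cluster on the host the fsid is inferred, so this
    # just works and drops you into the shell container:
    ceph-daemon shell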
> Now picking up on the scope issue for the orchestrator - apologies if
> this sounds like a manifesto... I'm a "usability" addict!
>
> IMO, our collective goal should be to drive ease of use and Ceph adoption
> beyond Linux geeks. If that's a view that resonates, I think the
> orchestrator has a critical role to play to enable that strategy.
> Personally I'd like to see the orchestrator evolve over time to become
> the automation engine that enables an open source ecosystem around Ceph:
> - provide a default implementation for monitoring/alerting/metrics - this
> can be simple and doesn't need HA - as Sage has already mentioned
> - samba/ganesha deployment, load balancers to improve radosgw, etc.
> - integration with platform management (why not show in the ceph dashboard
> whether you have patches outstanding against your host, or the host has a
> bad PSU) - enable the sysadmin to work more efficiently on Ceph, and maybe
> they'd prefer it over other platforms.
>
> We absolutely still need to support DIY configurations - but having a
> strategy that delivers a better out-of-the-box Ceph experience is surely
> our goal.
>
> </soapbox>

+1

sage

> On Fri, Nov 29, 2019 at 8:22 PM Jan Fajerski <jfajerski@xxxxxxxx> wrote:
>
> > On Thu, Nov 28, 2019 at 02:26:36PM +0000, Sage Weil wrote:
> > --snip--
> > >> >I think it makes sense to focus on the out-of-the-box opinionated
> > >> >easy scenario vs the DIY case, in general at least.  But I have a
> > >> >few questions...
> > >>
> > >> I think this focus will leave some users in the dust.  Monitoring
> > >> with prometheus can get complex, especially if it is to be fault
> > >> tolerant (which imho is important for confidence in such a system).
> > >> Also, typically users don't want several monitoring systems in their
> > >> environment.  So let's keep the case of existing prometheus systems
> > >> in mind please.
> > >
> > >That's what I meant by 'vs' above... perhaps I should have said 'or'.
> > >Either we deploy something simple and opinionated, or the user
> > >attaches to their existing or self-configured setup.  We probably
> > >don't need to worry about the various points in the middle ground
> > >where we manage only part of the metrics solution.
> >
> > I'm not sure we'll get off this easy.  At the very least the prometheus
> > mgr module is deployed by us.  There is also an argument to be made for
> > monitoring the things that we take control over, i.e. the containers we
> > deploy (one node_exporter per container is a common setup) and maybe
> > even the hosts that the orchestrator provisions.
> >
> > >(Also, I'm trying to use 'metrics' to mean prometheus etc, vs
> > >'monitoring', which in my mind is nagios or pagerduty or whatever and
> > >presumably has a level of HA required, and/or needs to be external
> > >instead of baked-in.)
> >
> > Not sure I understand that distinction.  You mean metrics for the
> > prometheus setup the orchestrator intends to install?  (prometheus can
> > certainly be a fully fledged monitoring stack.)
> >
> > Jan
> >
> > >sage
> > >
> > >> >- In the DIY case, does it make sense to leave the node-exporter to
> > >> >the reader too?  Or might it make sense for us to help deploy the
> > >> >node-exporter, but they run the external/existing prometheus
> > >> >instance?
> > >> >
> > >> >- Likewise, the alertmanager is going to have a bunch of
> > >> >ceph-specific alerts configured, right?  Might they want their own
> > >> >prom but we deploy our alerts?  (Is there any dependency in the
> > >> >dashboard on a particular set of alerts in prometheus?)
> > >> >
> > >> >I'm guessing you think no in both these cases...
> > >>
> > >> What I'm missing from proposals I've seen so far is an interface to
> > >> query the orchestrator for various prometheus bits.  First and
> > >> foremost, the orchestrator should have a command that returns a
> > >> prometheus file_sd_config of exporters that an external prometheus
> > >> stack should scrape.  Whether this is just the mgr exporter or also
> > >> node_exporters (or others) depends on how far the orchestrator will
> > >> take control.  Alerts are currently handled as an rpm but could
> > >> certainly be provided through a similar interface.
> > >>
> > >> At the very least, if the consensus is that the orchestrator
> > >> absolutely has to deploy everything itself, please at least provide
> > >> an interface so that a federated setup is easily possible (an
> > >> external prometheus scraping the orch-deployed prometheus), so that
> > >> users don't have to care what the orchestrator does with monitoring
> > >> (other than duplicating recorded metrics).  See
> > >> https://prometheus.io/docs/prometheus/latest/federation/#hierarchical-federation
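For illustration, the kind of interface being asked for here could look
something like this (purely a sketch -- the label names and the match
selector are made up; the ports are just the usual mgr/node_exporter/
prometheus defaults):

    # file_sd_config output (from whatever orchestrator command ends up
    # emitting it -- name TBD), in prometheus' standard targets+labels form:
    [
      {"targets": ["host1:9283"],
       "labels": {"ceph_cluster": "<fsid>", "service": "mgr-prometheus"}},
      {"targets": ["host1:9100", "host2:9100", "host3:9100"],
       "labels": {"ceph_cluster": "<fsid>", "service": "node-exporter"}}
    ]

    # and for the federation case, the external prometheus scrapes the
    # orch-deployed one via the /federate endpoint:
    scrape_configs:
      - job_name: 'ceph-federate'
        honor_labels: true
        metrics_path: '/federate'
        params:
          'match[]': ['{job=~".+"}']
        static_configs:
          - targets: ['ceph-prometheus-host:9090']

Either way an external prometheus can pick up the ceph targets without
caring how the orchestrator manages them internally.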
> > >> I'd really like to encourage the orchestrator team to carefully
> > >> think this through.  Monitoring is (at least for some users) a
> > >> critical infrastructure component with its own inherent complexity.
> > >> I'm worried that just doing this in a best-effort fashion and not
> > >> offering an alternative path is going to weaken the ceph ecosystem.
> > >>
> > >> >> > - Let's teach ceph-daemon how to do this, so that you do
> > >> >> > 'ceph-daemon deploy --fsid ... --name prometheus.foo -i
> > >> >> > input.json'.  ceph-daemon has the framework for opening
> > >> >> > firewall ports etc now... just add ports based on the daemon
> > >> >> > type.
> > >> >>
> > >> >> TBH, I'd keep the monitoring containers away from the ceph
> > >> >> daemons.  They require different parameters, config files, etc.,
> > >> >> so why not keep them separate and keep the ceph logic clean?
> > >> >> This also allows us to change monitoring without concerns over
> > >> >> logic changes to normal ceph daemon management.
> > >> >
> > >> >Okay, but mgr/ssh is still going to be wired up to deploy these.
> > >> >And to do so on a per-cluster, containerized basis... which means
> > >> >all of the infra in ceph-daemon will still be useful.  It seems
> > >> >easiest to just add it there.
> > >> >
> > >> >Your points above seem to point toward simplifying the containers
> > >> >we deploy to just two containers, one that's one-per-cluster for
> > >> >prom+alertmanager+grafana, and one that's per-host for the
> > >> >node-exporter.  But I think making it fit in nicely with the other
> > >> >ceph containers (e.g., /var/lib/ceph/$fsid/$thing) makes sense.
> > >> >Esp since we can just deploy these during bootstrap by default
> > >> >(unless some --external-prometheus is passed) and this all happens
> > >> >without the admin having to think about it.
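Concretely, however the prom/alertmanager/grafana pieces end up being
grouped, the default flow boils down to something like this (a sketch --
the daemon names and per-daemon json files are placeholders, not a
settled interface):

    # per cluster: the monitoring stack deployed like any other ceph daemon
    ceph-daemon deploy --fsid $fsid --name prometheus.host1    -i prometheus.json

    # per host: one node-exporter
    ceph-daemon deploy --fsid $fsid --name node-exporter.host1 -i node-exporter.json

    # with everything landing next to the ceph daemons:
    #   /var/lib/ceph/$fsid/prometheus.host1/
    #   /var/lib/ceph/$fsid/node-exporter.host1/

Bootstrap would run all of this by default unless something like an
--external-prometheus flag (name TBD) is passed.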
> > >> >> > WDYT?
> > >> >>
> > >> >> I'm sure a lot of the above has already been discussed at length
> > >> >> with the SuSE folks, so apologies for going over ground that
> > >> >> you've already covered.
> > >> >
> > >> >Not yet! :)
> > >> >
> > >> >sage
> >
> > --
> > Jan Fajerski
> > Senior Software Engineer Enterprise Storage
> > SUSE Software Solutions Germany GmbH
> > Maxfeldstr. 5, 90409 Nürnberg, Germany
> > (HRB 36809, AG Nürnberg)
> > Geschäftsführer: Felix Imendörffer

_______________________________________________
Dev mailing list -- dev@xxxxxxx
To unsubscribe send an email to dev-leave@xxxxxxx