Re: config on mons

Mark Nelson <mnelson@xxxxxxxxxx> · Tue, 14 Nov 2017 08:33:05 -0600

On 11/14/2017 05:36 AM, John Spray wrote:
On Tue, Nov 14, 2017 at 10:18 AM, Piotr Dałek <piotr.dalek@xxxxxxxxxxxx> wrote:
On 17-11-13 07:40 PM, John Spray wrote:

On Mon, Nov 13, 2017 at 6:20 PM, Kyle Bader <kyle.bader@xxxxxxxxx> wrote:

Configuration files are often driven by configuration management, with
previous versions stored in some kind of version control system. We
should make sure that if configuration moves to the monitors that you
have some form of history and rollback capabilities. It might be worth
modeling it similar to network switch configuration shells, a la
Junos.

* change configuration
* require commit configuration change
* ability to rollback N configuration changes
* ability to diff two configuration versions

That way an admin can figure out when the last configuration change
was, what changed, and rollback if necessary.
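
For illustration only, the commit/rollback/diff model in the list above
could be sketched roughly as follows in Python. The class and method
names are hypothetical, the option name is just an example key, and
nothing here is an existing Ceph interface.

```python
from typing import Dict, List, Tuple

Config = Dict[str, str]


class VersionedConfig:
    """Hypothetical commit/rollback/diff store; not an existing Ceph API."""

    def __init__(self) -> None:
        self._history: List[Config] = [{}]  # committed versions, oldest first
        self._pending: Config = {}          # staged, not yet committed

    def set(self, key: str, value: str) -> None:
        # Stage a change; nothing takes effect until commit().
        self._pending[key] = value

    def commit(self) -> int:
        # Apply staged changes on top of the latest version and record it.
        new = dict(self._history[-1])
        new.update(self._pending)
        self._history.append(new)
        self._pending = {}
        return len(self._history) - 1       # version number of this commit

    def current(self) -> Config:
        return dict(self._history[-1])

    def rollback(self, n: int = 1) -> int:
        # Undo the last n commits by committing a copy of the older version,
        # so the rollback itself stays visible in the history.
        if not 0 < n < len(self._history):
            raise ValueError("cannot roll back that many changes")
        self._history.append(dict(self._history[-1 - n]))
        return len(self._history) - 1

    def diff(self, a: int, b: int) -> Dict[str, Tuple[object, object]]:
        # Map each differing key to (value in a, value in b); None = unset.
        old, new = self._history[a], self._history[b]
        return {k: (old.get(k), new.get(k))
                for k in sorted(set(old) | set(new))
                if old.get(k) != new.get(k)}


cfg = VersionedConfig()
cfg.set("osd_max_backfills", "1")
v1 = cfg.commit()
cfg.set("osd_max_backfills", "4")
v2 = cfg.commit()
print(cfg.diff(v1, v2))   # what changed between the two commits
cfg.rollback(1)           # back to the state as of v1
print(cfg.current())
```

Recording a rollback as a new version, rather than truncating the
history, is one possible design choice: it keeps "when was the last
change" answerable even after an undo.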

That is an extremely good idea.

As a minimal thing, it should be pretty straightforward to implement a
snapshot/rollback.

https://thedailywtf.com/articles/The_Complicator_0x27_s_Gloves

I imagine many users today are not so disciplined as to version
control their configs, but this is a good opportunity to push that as
the norm by building it in.

Using Ceph on any decent scale actually requires one to use at least Puppet
or a similar tool. I wouldn't add any unnecessary complexity to already
complex code just because of novice users that are going to have a hard time
using Ceph anyway once a disk breaks and needs to be replaced, or when
performance goes to hell because users are free to create and remove
snapshots every 5 minutes.

All of the experienced users were novice users once -- making Ceph
work well for those people is worthwhile.  It's not easy to build
things that are easy enough for a newcomer but also powerful enough
for the general case, but it is worth doing.

When we have to trade internal complexity vs. complexity at
interfaces, it's generally better to keep the interfaces simple.

I've seen too many examples both in our code and in other projects where 
that kind of internal complexity leaks out and makes things worse.  If 
we want to reduce complexity we need to reduce complexity.  I'm not 
against having the mon centrally report state.  I think it's a great 
idea.  Management I'm not sold on, see below.

Currently a Ceph cluster with 1000 OSDs has 1000 places to input the
configuration, and no one place that a person can ask "what is setting
X on my OSDs?".  Even when they look at a ceph.conf file, they can't
be sure that those are really the values in use (has the service
restarted since the file was updated?) or that they will ever be (are
they invalid values that Ceph will reject on load?).
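
To make the "what is setting X on my OSDs?" problem concrete, here is a
small Python sketch that compares the values an admin intends (for
example, parsed from ceph.conf) against the values each daemon reports
as actually in use. The function name and both input dictionaries are
assumptions made for the example; how the running values would be
collected is deliberately left out.

```python
from typing import Dict, List, Optional, Tuple

Drift = Tuple[str, str, str, Optional[str]]


def find_drift(intended: Dict[str, str],
               running: Dict[str, Dict[str, str]]) -> List[Drift]:
    """Return (daemon, option, intended value, running value) mismatches."""
    drift: List[Drift] = []
    for daemon in sorted(running):
        values = running[daemon]
        for option in sorted(intended):
            want = intended[option]
            have = values.get(option)  # None if the daemon reports no value
            if have != want:
                drift.append((daemon, option, want, have))
    return drift


# Hypothetical data: the file says 2, but osd.1 has not picked it up yet.
intended = {"osd_max_backfills": "2"}
running = {
    "osd.0": {"osd_max_backfills": "2"},
    "osd.1": {"osd_max_backfills": "1"},
}
for daemon, option, want, have in find_drift(intended, running):
    print(f"{daemon}: {option} is {have}, expected {want}")
```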

How many folks with 1000 OSD clusters are manually managing 
configuration files though?  These are the kinds of customers that have 
dedicated Linux/storage administrators on staff that have preferences 
regarding how they do things.  When I was managing distributed storage 
systems few things angered me more than trying to deal with each storage 
vendor's custom management systems.  I was never particularly concerned 
with being able to manage (user-facing) state on my own.  What I was 
*very* concerned about was bug-ridden code that got shipped out at the 
last minute so the vendor could checkbox a feature that I couldn't 
easily work around.  There was a particular vendor's Lustre HA 
management/stonith solution that comes to mind.  They weren't the only 
one though.  We had a variety of interesting and horrific issues with 
other non-Lustre storage too.  The worst cases were the ones where the 
solution could have been fast/easy but we had to go through all kinds of 
gymnastics to circumvent the vendor's bad behavior.

The "dump a text file in /etc" interface looks simple on the face of
it, but is actually quite complex when you look to automate a Ceph
cluster from a central user interface, or build more intelligence into
Ceph for avoiding dangerous configurations.  It's also painful for
non-expert users who are required to type precisely correct syntax
into that text file.

This feels a bit like a proxy war over whether we are designing a 
storage appliance or a traditional Linux-style service.  I'm not 
convinced we can do both well at the same time.  If we want both, maybe 
we need to think about each as independent products with their own 
goals/management/code/etc.

And I can already imagine clusters breaking down once config
database/history breaks for whatever reason, including early implementation
bugs.

Distributing configs through the mon isn't a bad idea by itself. I can imagine
having changes to runtime-changeable settings propagated to OSDs without the
need for an extra step (actually injecting them) and without the need for
a restart, but for anything else, there are already good tools and I see no
value in trying to mimic them.

Remember that the goal here is not to just invent an alternative way
of distributing ceph.conf.  Even Puppet is overkill for that!  The
goal is to change the way configuration is defined in Ceph, so that
there is a central point of truth for how the cluster is configured,
which will enable us to create a user experience that is more robust,
and an interface that enables building better interactive tooling on
top of Ceph.

When it comes to using something like Puppet as that central point of
truth, there are two major problems with that:
 - If someone wants to write a GUI, they would need to integrate with
your Puppet, someone else's Chef, someone else's Ansible, etc -- a lot
of work, and in many cases the interfaces for doing it don't even
exist (believe me, I've tried writing dashboards that drove Puppet in
the past).
 - If Ceph wants to validate configuration options, and say "No, that
setting is no good" when someone tries to change something, we can't,
because we're not hooked in to Puppet at the point that the user is
changing the setting.
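
As a rough illustration of the second point, validating at the moment a
setting is changed might look like the Python sketch below. The schema,
the option names and the set_option() helper are all hypothetical and
exist only for this example; they are not Ceph options or interfaces.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass(frozen=True)
class Option:
    parse: Callable[[str], object]   # convert the raw string to a typed value
    check: Callable[[object], bool]  # reject values outside the allowed set
    help: str                        # shown to the user on rejection


# Hypothetical schema; these are not real Ceph option names.
SCHEMA: Dict[str, Option] = {
    "example_max_threads": Option(
        int, lambda v: 1 <= v <= 64, "integer from 1 to 64"),
    "example_log_level": Option(
        str, lambda v: v in {"debug", "info", "warn", "error"},
        "one of debug, info, warn, error"),
}


class ConfigError(ValueError):
    pass


def set_option(store: Dict[str, object], name: str, raw: str) -> None:
    """Validate first; store the value only if every check passes."""
    opt = SCHEMA.get(name)
    if opt is None:
        raise ConfigError(f"unknown option: {name}")
    try:
        value = opt.parse(raw)
    except ValueError:
        raise ConfigError(f"{name}: expected {opt.help}") from None
    if not opt.check(value):
        raise ConfigError(f"{name}: expected {opt.help}")
    store[name] = value


store: Dict[str, object] = {}
set_option(store, "example_max_threads", "8")      # accepted
try:
    set_option(store, "example_max_threads", "0")  # rejected at change time
except ConfigError as err:
    print("No, that setting is no good:", err)
```

The point of the sketch is only that the rejection happens where the
user makes the change, which an external tool writing a text file cannot
offer.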

The ultimate benefit to you is that by making Ceph easier to use, we
grow our community, and we grow the population of people who want to
invest in Ceph (all of it, not just the new user friendly bits).

John
--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html
