Re: looking for suggestions for managing a tree of server configs

david@xxxxxxx · Sun, 21 Oct 2012 16:33:44 -0700 (PDT)

On Sun, 21 Oct 2012, Drew Northup wrote:

On Sat, Oct 20, 2012 at 10:34 PM,  <david@xxxxxxx> wrote:
On Sat, 20 Oct 2012, Drew Northup wrote:
On Sun, Oct 14, 2012 at 12:57 AM,  <david@xxxxxxx> wrote:
On Sat, 13 Oct 2012, Junio C Hamano wrote:
david@xxxxxxx writes:

today I have just a single git tree covering everything, and I make a
commit each time one of the per-server directories is updated, and
again when the top-level stuff is created.

if a large portion of the configuration for these servers is
shared, it might not be a bad idea to have a canonical "gold-master"
configuration branch, to which the shared updates are applied, with
a branch per server that forks from that canonical branch to keep
the machine specific tweaks

In an ideal world yes, but right now these machines are updated by many
different tools (unfortunately including 'vi'), so these directories
aren't the config to be pushed out to the boxes; they are instead an
archived 'what is', the result of changes from all the tools.

So you need to save what is there before pulling changes from the
master. That's no different from doing development work on an active
code base.

I think I've done a poor job of explaining my problem.

I'm not looking for tips on how to manage the systems themselves, I'm 
looking for suggestions on how to manage this data that I'm already 
gathering on this reporting server.

I have the problem that different departments have their own (different) 
preferred tools for implementing changes. There are 6 different 
departments that need to be involved with a single system to build and 
maintain it. Each department has their 'standard' way of doing things. At 
least two of these departments are using different, central configuration 
(i.e. puppet like) tools.

As a result, I am not looking to pull changes from the central location. 
I'm just trying to gather information and be able to produce reports about 
the systems (including "This is what all the different config files on 
this server were like at time X"). I'm not using the distributed features 
of git at this time.
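
For the "what did this server look like at time X" report, something along these lines might work. This is only a sketch: snapshot_at and its three arguments are hypothetical names, and it assumes the command is run from the top of the repository that holds configs-current/ and that commit times are close enough to when the data was gathered.

```shell
# Hypothetical helper: extract one system's files as they were at a given time.
# Assumes the current directory is the top of the repository that
# contains configs-current/details/<systemname>.
snapshot_at() {
    systemname=$1   # directory name under configs-current/details
    when=$2         # any date string git understands, e.g. '2012-03-01'
    outdir=$3       # where the old copy of the files is written

    # Newest commit reachable from HEAD with a commit date at or before $when.
    commit=$(git rev-list -1 --before="$when" HEAD)
    [ -n "$commit" ] || return 1

    mkdir -p "$outdir" || return 1
    # git archive writes a tar stream of just that path at that commit.
    git archive "$commit" "configs-current/details/$systemname" |
        tar -x -f - -C "$outdir"
}
```

Called as snapshot_at web1 '2012-03-01' /tmp/web1-then, it would leave the files under /tmp/web1-then/configs-current/details/web1 (web1 being a made-up system name).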

I've got existing tools that do a job very similar to what it sounds like 
sysconfcollect does: they gather the non-sensitive info from all my remote 
machines and send it to my central server. These tools send an update 
whenever 'significant' changes are made, and in addition do a scheduled 
update to catch less significant changes.

On my central server I have the directory configs-current, which has a 
subdirectory details/systemname for each system that contains all the 
information about that system (populated by scripts that parse apart the 
data mentioned above).

In other files and directories in configs-current I have lots of more 
global data and reports. This includes things like a report of every 
interface on every machine, the IP address, does it have link, what speed 
is it at, etc.

Right now I have one git tree for configs-current and each time I update a 
details/systemname tree I do

git add -A configs-current/details/$systemname
git commit -m "system update from $servername"

then when I run the summary scripts I do

git add -A configs-current
git commit -m'summary update'

This has been working for a few years.

However, trying to go back in history to find a change on one system is a 
pain.
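
For reference, the kind of lookup I mean is a path-limited log. A minimal sketch, assuming it is run from the top of the repository that contains configs-current/ (system_history is a hypothetical name, not one of my existing scripts):

```shell
# Hypothetical helper: list only the commits that touched one system's directory.
# Assumes the current directory is the top of the repository that
# contains configs-current/details/<systemname>.
system_history() {
    systemname=$1
    # Everything after "--" is a path limiter, so commits that only
    # changed other systems or the summary reports are skipped.
    git log --oneline -- "configs-current/details/$systemname"
}
```

git still has to walk the whole history to filter it, so this narrows the output but does not by itself make the search faster on a tree with this many commits.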

Right now the updates accumulate until I manually trigger a processing 
cycle to update the files. I would like to make it so that the updates to 
each system's details/systemname directory are done automatically as the 
e-mail from that system arrives, and this could result in parallel 
updates. I don't think that git will handle this well in one tree with the 
existing process (different processes doing git add and git commit in 
parallel share one index, so they will end up mixing their data).
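
One way the parallel updates could be kept apart in a single tree is to serialize the add and commit steps behind a lock. The following is a rough sketch under assumptions, not something I run today: commit_system_update and LOCKDIR are hypothetical names, and it relies on mkdir being atomic and on every updater going through this same function from the top of the repository.

```shell
# Hypothetical lock location; any path on the same filesystem would do.
LOCKDIR=${LOCKDIR:-.git/update.lock.d}

# Hypothetical wrapper: stage and commit one system's directory while
# holding the lock, so two updaters never share the index at once.
commit_system_update() {
    systemname=$1

    # mkdir either creates the directory or fails, atomically, so only
    # one process at a time gets past this loop.
    until mkdir "$LOCKDIR" 2>/dev/null; do
        sleep 1
    done

    # Stage additions, modifications and deletions for this system only.
    git add -A "configs-current/details/$systemname"
    git commit -q -m "system update from $systemname"
    status=$?

    # Release the lock and report whether the commit succeeded.
    rmdir "$LOCKDIR"
    return $status
}
```

A process that dies while holding the lock would leave the directory behind and block everyone else, so a real version would need some cleanup. This also only addresses the mixing problem, not the speed of looking through the history.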

As one big tree, this has lots of commits (a couple hundred each update), 
and this is making it slow to try and track changes to a particular file 
in a particular system.

I'm thinking that splitting the history tracking per-server should make 
everything faster.

I'm wondering if I should do a subproject for each details/systemname 
directory, or if there is something else I can do to make this tracking of 
the data better.
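
To make the per-system idea concrete, here is roughly what I picture, with each details/systemname directory as its own repository. Treat it as an untested sketch: init_system_repo and commit_system_repo are hypothetical names, and how the top-level configs-current tree should relate to these nested repositories (ignore them, or record them as subprojects) is exactly the part I'm unsure about.

```shell
# Hypothetical helper: give one system directory its own repository,
# so its history and index are independent of every other system.
init_system_repo() {
    dir=$1    # e.g. configs-current/details/$systemname
    # Initialize only once; an existing repository is left alone.
    [ -d "$dir/.git" ] || git init -q "$dir"
}

# Hypothetical helper: commit whatever changed inside that one repository.
commit_system_repo() {
    dir=$1
    systemname=$(basename "$dir")
    (
        cd "$dir" || exit 1
        git add -A .
        # "git diff --cached --quiet" exits 0 when nothing is staged,
        # so the commit only runs when there is something to record.
        git diff --cached --quiet ||
            git commit -q -m "system update from $systemname"
    )
}
```

Because each repository has its own index, two systems could be updated at the same time without mixing their data, and the log for one system would contain only that system's commits.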

Doing a single repository with lots of branches doesn't seem like it would 
work as I need to get at the data from all the branches at the same time. 
I guess I could do something with branches on one repository, with a 
different worktree for each system, but that seems a bit fragile (one 
command with the wrong environment variables and it could really tangle 
things up).

David Lang