Re: nagios and ansible

----- Original Message -----
> From: "seth vidal" <skvidal@xxxxxxxxxxxxxxxxx>
> To: "infrastructure" <infrastructure@xxxxxxxxxxxxxxxxxxxxxxx>
> Sent: Monday, June 17, 2013 1:35:21 PM
> Subject: nagios and ansible
> 
> I've been thinking about how we should handle nagios in the ansible
> world.
> 
> our current nagios config in puppet has a number of issues:
> 1. it's a bit cumbersome b/c you edit nagios independent of adding the
> host's config
> 2. when you remove a host the nagios config doesn't automatically go
> away
> 3. the fqdn/vpn hostname thing between noc01 and noc02 is kinda a giant
> pain in the ass
> 4. dependencies between Networks->vhosts->guests are manual and
> irritating to maintain.
> 
> I'm open to suggestions about how to maintain all of this. Here are
> some ideas I've tinkered with:
> 
> 
> a. we stop putting nagios configs in the specific config mgmt entirely
> -and put it in another repo - like with dns. That doesn't make 1 or 2
> any better - but it could allow us to script to make 3 and 4 much better
> 
> b. we make populating nagios configs for hosts or services be a
> function of playbooking the host/group creation. So all of the nagios
> configs go on when you add the host. - that doesn't solve 2 or 4 but
> maybe it does handle 1 and 3 a bit.
> 
> c. we make the nagios configs generate from the host inventory data
> that ansible can retrieve. It will require us to define a series of
> additional variables per host or per group. So when you add a new host
> you'll need to wait for a cron run or an ansible run against our nagios
> hosts to get them to see the new hosts. With enough effort I think we
> can tag all of 1, 2, 3 and 4 in creating THE whole set of nagios
> configs that way and rsyncing them over using the ansible-rsync module
> (or just rsync).
>   The problem with this one is that it seems like an all-or-nothing
>   scenario - we need to drive ALL of our nagios configs off of this or
>   none at all. With that in mind it seems like we would need to define
>   hosts as part of ansible even if they are still being managed by
>   puppet. That's extra work but I think it is work we'd have to do
>   eventually.
> 
>  So (c) would be something like this:
>   - take the list of hosts - look for a vmhost or if it is a cloud
>     instance - make that a dep
>   - look for a datacenter - make that a dep
>   - look for a vpn cert - make that a dep
>   - and on up the chain.
>   - look for any special service definitions that we'd be managing
>     manually
>   - put all of the hosts definitions in one big file so changing out
>     that file can be idempotent
>   - put service definitions in individual files - but have the files
>     rsynced over with --delete so removing one gets removed on the
>     nagios side, too
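
For what it's worth, the dependency walk in (c) could be sketched roughly like this in Python. This is only an illustration: the variable names (vmhost, datacenter, address), the sample hosts, and the generic-host template are assumptions made up for the example, not names taken from the real inventory. It emits one big, sorted hosts file so that regenerating it from the same data gives byte-identical output:

```python
# Sketch only: inventory layout and variable names are hypothetical.

def render_host(name, hostvars):
    """Render one nagios host object, using vmhost (else datacenter) as parent."""
    parent = hostvars.get("vmhost") or hostvars.get("datacenter")
    lines = [
        "define host {",
        "    use        generic-host",  # assumed template name
        f"    host_name  {name}",
        f"    address    {hostvars.get('address', name)}",
    ]
    if parent:
        # 'parents' is the nagios host directive for reachability deps
        lines.append(f"    parents    {parent}")
    lines.append("}")
    return "\n".join(lines)


def render_hosts_file(inventory):
    """Render every host into one file body, sorted so output is stable."""
    blocks = [render_host(name, inventory[name]) for name in sorted(inventory)]
    return "\n\n".join(blocks) + "\n"


# Hypothetical inventory data: guest -> vmhost -> datacenter
inventory = {
    "guest01": {"vmhost": "virthost01", "datacenter": "dc-a"},
    "virthost01": {"datacenter": "dc-a"},
    "dc-a": {},
}

hosts_cfg = render_hosts_file(inventory)
print(hosts_cfg)
```

Because the hosts are sorted before rendering, the order in which the inventory is read does not change the file, which is what makes swapping the file out safe to repeat. Service definitions would still be separate files synced with --delete, as described above.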
>  
> 
> Anyone have an option D we should think about? I'd like to hear about
> more

So I hate to say "what about not any of this as option D" but... have you looked at sensu at all?
http://sensuapp.org

While I cannot answer half the questions you have above :) - I can somewhat confidently relate the following items:

1) Designed as ... nagios... for the cloud... but doesn't suck
2) Automagically detects new hosts
3) Plugs into rabbitmq (and thus hopefully fedbus - not sure what flavor of amqp we are using there?)
4) Can re-use existing nagios plugins
5) Events can be passed to handlers - either stuff like making pretty pictures (graphite), pagerduty, etc. or triggering something else to happen (scripts, etc) 
6) Handles things like roles, users, etc; clients have (multiple) subscriptions, and will do checks based on what they subscribe to ("production" or "web" or "mailserver" or whatever you dream up)
7) Works with Puppet and Chef; I am not sure about ansible, but googling turns up some discussion of that capability (admittedly I didn't look past the list of links)
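
To make the subscription idea in item 6 a bit more concrete, here is a tiny Python model of it. This is not Sensu's code or its API, just a sketch of the matching logic as I understand it: each check lists the subscriptions it applies to, and a client runs every check that shares at least one subscription with it. The check names and commands are made-up examples using ordinary nagios plugins:

```python
# Sketch only: a model of subscription matching, not Sensu's implementation.

def checks_for_client(subscriptions, checks):
    """Return the sorted names of checks that target any of the client's subscriptions."""
    subs = set(subscriptions)
    return sorted(
        name
        for name, check in checks.items()
        if subs & set(check["subscribers"])
    )


# Hypothetical check definitions
checks = {
    "check_http": {"command": "check_http -H localhost", "subscribers": ["web"]},
    "check_smtp": {"command": "check_smtp -H localhost", "subscribers": ["mailserver"]},
    "check_load": {"command": "check_load -w 5,4,3 -c 10,8,6", "subscribers": ["production"]},
}

# A client subscribed to both "production" and "web"
print(checks_for_client(["production", "web"], checks))
```

The appeal is that adding a host means giving it a list of subscriptions rather than editing per-host check config.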

I saw Joe Miller from Pantheon give a presentation on sensu a few weeks ago (his slides are at the second link below). They are using it with Fedora, so reaching out to them might be enlightening.
https://speakerdeck.com/joemiller/introduction-to-sensu
https://speakerdeck.com/joemiller/practical-examples-with-sensu-monitoring-framework

Also: http://docs.sensuapp.org/0.9/overview.html 

And, yes, ruby, meh. But checks and handlers can be written in any language. No clue how well it would handle all the network things you list, but it does seem fairly flexible.

Anyway: People seem to be pretty happy with it for cloudy things... since it was designed with that in mind. Might be worth at least playing with or reading about?

-r


> 
> Thanks,
> -sv
> 
>   
> 
> _______________________________________________
> infrastructure mailing list
> infrastructure@xxxxxxxxxxxxxxxxxxxxxxx
> https://admin.fedoraproject.org/mailman/listinfo/infrastructure